ESSENTIAL BACTERIAL GENES AND THEIR USE

Background of the Invention 
The invention relates to essential bacterial genes and their use in identifying 
antibacterial agents. 
Bacterial infections may be cutaneous, subcutaneous, or systemic.

Opportunistic bacterial infections proliferate, especially in patients afflicted with
AIDS or other diseases that compromise the immune system. The bacterium
Streptococcus pneumoniae typically infects the respiratory tract and can cause lobar
pneumonia, as well as meningitis, sinusitis, and other infections.

Summary of the Invention

The invention is based on the discovery of 23 genes in the bacterium
Streptococcus pneumoniae, and a related gene in the bacterium Bacillus subtilis,
that are located within operons that are essential for survival. These 23
Streptococcus genes are referred to herein as "GEP genes" (which stands for
general essential protein); for convenience, the polypeptides encoded by these genes
are referred to herein as "GEP polypeptides." Each GEP gene is located within an
operon that contains a gene that is essential for survival of Streptococcus
pneumoniae; the essential gene can be the GEP gene or another gene located within
the same operon. Bacterial operons contain several genes that are related, e.g.,
with respect to function or biochemical pathway. Transcription of an operon leads
to the production of a single transcript in which multiple coding regions are linked.
Thus, an operon containing one or more essential genes can be considered an
"essential operon," since disruption of expression of one gene located within the
operon will interfere with expression of the other genes in the operon. Each coding
region of the transcript is separately translated into an individual polypeptide by
ribosomes that initiate translation at multiple points along the transcript. Having
identified one gene in the operon, one can readily identify and sequence the other
genes located within the operon.

The genes encoding the GEP polypeptides are useful molecular tools for
identifying similar genes in pathogenic microorganisms, such as pathogenic strains
of Bacillus. In addition, the operons containing genes encoding GEP polypeptides,
and the polypeptides encoded by such operons, are useful targets for identifying
compounds that are inhibitors of the pathogens in which the GEP polypeptides are
expressed. Such inhibitors inhibit bacterial growth by being bacteriostatic (e.g.,
inhibiting reproduction or cell division) or by being bacteriocidal (i.e., by causing
cell death).

The invention, therefore, features an isolated polypeptide encoded by a
nucleic acid located within an operon encoding a GEP polypeptide, termed gep103,
having the amino acid sequence set forth in SEQ ID NO:1, or conservative
variations thereof. An isolated operon comprising a nucleic acid encoding gep103
also is included within the invention. In addition, the invention includes an
isolated nucleic acid of (a) an operon comprising the sequence of SEQ ID NO:2, as
depicted in Fig. 1, or degenerate variants thereof; (b) an operon comprising the
sequence of SEQ ID NO:2, or degenerate variants thereof, wherein T is replaced by
U; (c) nucleic acids complementary to (a) and (b); and (d) fragments of (a), (b),
and (c) that are at least 15 base pairs in length and that hybridize under stringent
conditions to genomic DNA encoding the polypeptide of SEQ ID NO:1. As
described above for gep103, other nucleic acids and polypeptides encoded by
nucleic acids located within operons encoding GEP polypeptides are included
within the invention, including: (a) operons comprising the nucleic acids
represented by the SEQ ID NOs. listed below, as depicted in the Figures listed
below, or degenerate variants thereof; (b) operons comprising the nucleic acids
represented by the SEQ ID NOs. listed below, wherein T is replaced by U;
(c) nucleic acids complementary to (a) and (b); and (d) fragments of (a), (b), and
(c) that are at least 15 base pairs in length and that hybridize under stringent
conditions to genomic DNA encoding the polypeptides represented by the SEQ ID
NOs. listed below.
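
The sequence relationships recited in items (b) through (d) above (replacement of T by U, complementarity, and the 15 base pair minimum fragment length) can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical and are not taken from the sequence listing; whether a fragment hybridizes under stringent conditions is an experimental determination that the sketch does not attempt to model.

```python
# Illustrative helpers only. The function names and the example sequence are
# hypothetical; they are not part of the sequence listing of this specification.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence (every T replaced by U)."""
    return dna.upper().replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def meets_length_criterion(fragment: str, min_length: int = 15) -> bool:
    """Check only the minimum-length criterion for a fragment.

    Hybridization under stringent conditions must be tested experimentally.
    """
    return len(fragment) >= min_length


example = "ATGGCTAAAGAACTGTTC"  # made-up 18-nucleotide sequence
print(to_rna(example))
print(reverse_complement(example))
print(meets_length_criterion(example))
```

Applying reverse_complement twice returns the original strand, which is a convenient sanity check when preparing probe or primer sequences.
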

Table 1: GEP nucleic acids and polypeptides

The invention also includes allelic variants (i.e., genes encoding isozymes)
of the genes located within operons encoding the GEP polypeptides listed above.
For example, the invention includes a gene that encodes a GEP polypeptide but
that includes one or more point mutations, deletions, promoter variants, or
splice site variants, provided that the resulting GEP polypeptide functions as a GEP
polypeptide (e.g., as determined in a conventional complementation assay).

Identification of these GEP genes and the determination that they are
located within operons containing an essential gene allows homologs of the GEP
genes to be found in other strains of Streptococcus. Also, orthologs of
these genes can be identified in other species (e.g., Bacillus sp.). While
"homologs" are structurally similar genes contained within a species, "orthologs"
are functionally equivalent genes from other species (within or outside of a given
genus, e.g., from Bacillus subtilis or E. coli). Such homologs and orthologs are
expected to be located within operons that are essential for survival. Such
homologous and orthologous genes and polypeptides can be used to identify
compounds that inhibit the growth of the host organism (e.g., compounds that are
bacteriocidal or bacteriostatic against pathogenic strains of the organism).
Homologous and orthologous genes and polypeptides that are essential for survival
can serve as targets for identifying a broad spectrum of antibacterial agents.

An ortholog of gep1493, termed B-yneS, has been identified in B. subtilis
and is essential for survival of B. subtilis. The amino acid sequence (SEQ ID NO:70),
coding sequence (SEQ ID NO:71), and non-coding sequence (SEQ ID NO:72)
of B-yneS are set forth in Fig. 24. As with the other polypeptides and genes
disclosed herein, the B-yneS polypeptide and gene can be used in the methods
described herein to identify antibacterial agents.

The term gep103 polypeptide or gene as used herein is intended to include
the polypeptide and gene set forth in Fig. 1 herein, as well as homologs of the
sequences set forth in Fig. 1. Also encompassed by the term gep103 gene are
degenerate variants of the nucleic acid sequence set forth in Fig. 1 (SEQ ID NO:2).
Degenerate variants of a nucleic acid sequence exist because of the degeneracy of
the genetic code; thus, those sequences that vary from the sequence represented
by SEQ ID NO:2, but which nonetheless encode a gep103 polypeptide, are included
within the invention. Likewise, because of the similarity in the structures of amino
acids, conservative variations (as described herein) can be made in the amino acid
sequence of the gep103 polypeptide while retaining the function of the polypeptide
(e.g., as determined in a conventional complementation assay). Other gep103
polypeptides and genes identified in additional Streptococcus strains may be such
conservative variations or degenerate variants of the particular gep103 polypeptide
and nucleic acid set forth in Fig. 1 (SEQ ID NOs:1 and 2, respectively). The
gep103 polypeptide and gene share at least 80%, e.g., 90%, sequence identity with
SEQ ID NOs:1 and 2, respectively. Regardless of the percent sequence identity
between the gep103 sequence and the sequence represented by SEQ ID NOs:1 and
2, the gep103 genes and polypeptides encompassed by the invention are able to
complement the lack of gep103 function (e.g., in a temperature-sensitive
mutant) in a standard complementation assay. Additional gep103 genes that are
identified and cloned from additional Streptococcus strains, and pathogenic strains
in particular, can be used to produce gep103 polypeptides for use in the various
methods described herein, e.g., for identifying antibacterial agents. Likewise, the
terms gep1119, gep1122, gep1315, gep1493, gep1507, gep1511, gep1518, gep1546,
gep1551, gep1561, gep1580, gep1713, gep222, gep2283, gep273, gep286, gep311,
gep3262, gep3387, gep47, gep61, and gep76 encompass homologs, conservative
variations, and degenerate variants of the sequences depicted in Figs. 2-23,
respectively. Such homologs, conservative variations, and degenerate variants also
are included within the invention.
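
The notion of a degenerate variant can be illustrated with a short Python sketch: two nucleic acid sequences that differ at several positions yet encode the same amino acid sequence. The codon table below is a deliberately partial excerpt of the standard genetic code (only the codons used in the example), and both example sequences are made up for illustration; neither is taken from SEQ ID NO:2.

```python
# Partial excerpt of the standard genetic code: only the codons needed for
# this illustration are listed. "*" marks a stop codon.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "GAA": "E", "GAG": "E",
    "CTG": "L", "TTA": "L",
    "TAA": "*",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon (reading frame 1)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


def are_degenerate_variants(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ but encode the same polypeptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


# Hypothetical sequences, not taken from the sequence listing.
seq_a = "ATGGCTAAAGAACTGTAA"
seq_b = "ATGGCCAAGGAGTTATAA"

print(translate(seq_a), translate(seq_b))
print(are_degenerate_variants(seq_a, seq_b))
```

A codon absent from the partial table raises a KeyError, which is intentional in this sketch; a complete table would be needed for real sequences.
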

Since the various GEP genes described herein have been identified and
shown to be located within operons that are essential for survival, the GEP genes
and polypeptides encoded by nucleic acid sequences located within operons
containing GEP genes and their homologs and orthologs can be used to identify
antibacterial agents. More specifically, the polypeptides encoded by nucleic acid
sequences located within operons containing GEP genes can be used, separately or
together, in assays to identify test compounds that bind to these polypeptides. Such
test compounds are expected to be antibacterial agents, in contrast to compounds
that do not bind to these GEP polypeptides. As described herein, any of a variety
of art-known methods can be used to assay for binding of test compounds to the
polypeptides. The invention includes, for example, a method for identifying an
antibacterial agent where the method entails: (a) contacting a polypeptide encoded
by a nucleic acid sequence located within an operon containing a GEP gene, or
homolog or ortholog thereof, with a test compound; (b) detecting binding of the
test compound to the polypeptide or homolog or ortholog; and (c) determining
whether a test compound that binds to the polypeptide or homolog or ortholog
inhibits growth of bacteria, relative to growth of bacteria cultured in the absence of
the test compound that binds to the polypeptide or homolog or ortholog, as an
indication that the test compound is an antibacterial agent.

In various embodiments, the GEP polypeptide is derived from a
non-pathogenic or pathogenic Streptococcus strain, such as Streptococcus pneumoniae,
Streptococcus pyogenes, Streptococcus agalactiae, Streptococcus endocarditis,
Streptococcus faecium, Streptococcus sanguis, Streptococcus viridans, and
Streptococcus hemolyticus. Suitable orthologs of the Streptococcus GEP genes can
be derived from the bacterium Bacillus subtilis. The test compound can be
immobilized on a substrate, and binding of the test compound to the polypeptide or
homolog or ortholog can be detected as immobilization of the polypeptide or
homolog or ortholog on the immobilized test compound, e.g., in an immunoassay
with an antibody that specifically binds to the polypeptide.

If desired, the test compound can be a test polypeptide (e.g., a polypeptide
having a random or predetermined amino acid sequence; or a naturally-occurring or
synthetic polypeptide). Alternatively, the test compound can be a nucleic acid,
such as a DNA or RNA molecule. In addition, small organic molecules can be
tested. The test compound can be a naturally-occurring compound or it can be
synthetically produced, if desired. Synthetic libraries, chemical libraries, and the
like can be screened to identify compounds that bind to the polypeptides. More
generally, binding of test compounds to the polypeptide or homolog or ortholog
can be detected either in vitro or in vivo. Regardless of the source of the test
compound, the polypeptides described herein can be used to identify compounds
that are bacteriocidal or bacteriostatic to a variety of pathogenic or non-pathogenic
strains.

In an exemplary method, binding of a test compound to a polypeptide
encoded by a nucleic acid located within an operon containing a GEP gene can be
detected in a conventional two-hybrid system for detecting protein/protein
interactions (e.g., in yeast or mammalian cells). Generally, in such a method,
(a) the polypeptide encoded by a nucleic acid located within an operon containing a
GEP gene is provided as a fusion protein that includes the polypeptide fused to (i)
a transcription activation domain of a transcription factor or (ii) a DNA-binding
domain of a transcription factor; (b) the test polypeptide is provided as a fusion
protein that includes the test polypeptide fused to (i) a transcription activation
domain of a transcription factor or (ii) a DNA-binding domain of a transcription
factor; and (c) binding of the test polypeptide to the polypeptide is detected as
reconstitution of a transcription factor. Homologs and orthologs of the GEP
polypeptides can be used in similar methods. Reconstitution of the transcription
factor can be detected, for example, by detecting transcription of a gene that is
operably linked to a DNA sequence bound by the DNA-binding domain of the
reconstituted transcription factor (See, for example, White, 1996, Proc. Natl. Acad.
Sci. 93:10001-10003 and references cited therein and Vidal et al., 1996, Proc. Natl.
Acad. Sci. 93:10315-10320).

In an alternative method, an isolated operon containing a nucleic acid
molecule encoding a GEP polypeptide is used to identify a compound that
decreases the expression of a GEP polypeptide in vivo. Such compounds can be
used as antibacterial agents. To discover such compounds, cells that express a GEP
polypeptide are cultured, exposed to a test compound (or a mixture of test
compounds), and the level of expression or activity is compared with the level of
GEP polypeptide expression or activity in cells that are otherwise identical but that
have not been exposed to the test compound(s). Many standard quantitative assays
of gene expression can be utilized in this aspect of the invention.

To identify compounds that modulate expression of a GEP polypeptide (or
homologous or orthologous sequence), the test compound(s) can be added at
varying concentrations to the culture medium of cells that express a GEP
polypeptide (or homolog or ortholog), as described herein. Such test compounds
can include small molecules (typically, non-protein, non-polysaccharide chemical
entities), polypeptides, and nucleic acids. The expression of the GEP polypeptide
is then measured, for example, by Northern blot, PCR analysis, or RNAse protection
analysis using a nucleic acid molecule of the invention as a probe. The level of
expression in the presence of the test molecule, compared with the level of
expression in its absence, will indicate whether or not the test molecule alters the
expression of the GEP polypeptide. Because the GEP polypeptides are expressed
from operons that are essential for survival, test compounds that inhibit the
expression and/or function of the GEP polypeptide will inhibit growth of the cells
or kill the cells.

Compounds that modulate the expression of the polypeptides of the
invention can be identified by carrying out the assays described herein and then
measuring the levels of the GEP polypeptides expressed in the cells, e.g., by
performing a Western blot analysis using antibodies that bind to a GEP
polypeptide.

The invention further features methods of identifying from a large group of
mutants those strains that have conditional lethal mutations. In general, the gene
and corresponding gene product are subsequently identified, although the strains
themselves can be used in screening or diagnostic assays. The mechanism(s) of
action for the identified genes and gene products provide a rational basis for the
design of antibacterial therapeutic agents. These antibacterial agents reduce the
action of the gene product in a wild type strain, and therefore are useful in treating
a subject with that type, or a similarly susceptible type, of infection by
administering the agent to the subject in a pharmaceutically effective amount.
Reduction in the action of the gene product includes competitive inhibition of the
gene product for the active site of an enzyme or receptor; non-competitive
inhibition; disrupting an intracellular cascade path which requires the gene product;
binding to the gene product itself, before or after post-translational processing; and
acting as a gene product mimetic, thereby down-regulating the activity.
Therapeutic agents include monoclonal antibodies raised against the gene product.

Furthermore, the presence of the gene sequence in certain cells (e.g., a
pathogenic bacterium of the same genus or similar species), and the absence or
divergence of the sequence in host cells can be determined, if desired. Therapeutic
agents directed toward genes or gene products that are not present in the host have
several advantages, including fewer side effects and lower overall dosage.

The invention includes pharmaceutical formulations that include a
pharmaceutically acceptable excipient and an antibacterial agent identified using the
methods described herein. In particular, the invention includes pharmaceutical
formulations that contain antibacterial agents that inhibit the growth of, or kill,
pathogenic Streptococcus strains. Such pharmaceutical formulations can be used
for treating a Streptococcus infection in an organism. Such a method entails
administering to the organism a therapeutically effective amount of the
pharmaceutical formulation. In particular, such pharmaceutical formulations can be
used to treat streptococcal pneumonia in mammals such as humans and
domesticated mammals (e.g., cows, pigs, dogs, and cats), and in plants. The
efficacy of such antibacterial agents in humans can be estimated in an animal
model system well known to those of skill in the art (e.g., mouse and rabbit model
systems).

Also included within the invention are polyclonal and monoclonal antibodies
that specifically bind to the various GEP polypeptides described herein (e.g.,
gep103). Such antibodies can facilitate detection of GEP polypeptides in various
Streptococcus strains. These antibodies also are useful for detecting binding of a
test compound to GEP polypeptides (e.g., using the assays described herein). In
addition, monoclonal antibodies that bind to GEP polypeptides are themselves
adequate antibacterial agents when administered to a mammal, as such monoclonal
antibodies are expected to impede one or more functions of GEP polypeptides.

As used herein, "nucleic acids" encompass both RNA and DNA, including
genomic DNA and synthetic (e.g., chemically synthesized) DNA. The nucleic acid
can be double-stranded or single-stranded. Where single-stranded, the nucleic acid
may be a sense strand or an antisense strand. The nucleic acid may be synthesized
using oligonucleotide analogs or derivatives (e.g., inosine or phosphorothioate
nucleotides). Such oligonucleotides can be used, for example, to prepare nucleic
acids that have altered base-pairing abilities or increased resistance to nucleases.

An "isolated nucleic acid" is a DNA or RNA that is not immediately
contiguous with both of the coding sequences with which it is immediately
contiguous (one on the 5' end and one on the 3' end) in the naturally occurring
genome of the organism from which it is derived. Thus, in one embodiment, an
isolated nucleic acid includes some or all of the 5' non-coding (e.g., promoter)
sequences that are immediately contiguous to the coding sequence. The term
therefore includes, for example, a recombinant DNA that is incorporated into a
vector, into an autonomously replicating plasmid or virus, or into the genomic
DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a
genomic DNA fragment produced by PCR or restriction endonuclease treatment)
independent of other sequences. It also includes a recombinant DNA that is part of
a hybrid gene encoding an additional polypeptide sequence. The term "isolated"
can refer to a nucleic acid or polypeptide that is substantially free of cellular
material, viral material, or culture medium (when produced by recombinant DNA
techniques), or chemical precursors or other chemicals (when chemically
synthesized). Moreover, an "isolated nucleic acid fragment" is a nucleic acid
fragment that is not naturally occurring as a fragment and would not be found in
the natural state. As used herein, the term "isolated nucleic acid molecule" includes
an operon containing a contiguous cluster of linked sequences. "Isolated operons"
are those operons that are not naturally occurring and which are not associated with
the sequences by which they are normally surrounded in a bacterial genome.

A nucleic acid sequence that is "substantially identical" to a GEP nucleotide
sequence is at least 80% (e.g., 85%) identical to the nucleotide sequence of the
nucleic acid sequences represented by the SEQ ID NOs. listed in Table 1, as
depicted in Figs. 1-23. For purposes of comparison of nucleic acids, the length of
the reference nucleic acid sequence will generally be at least 40 nucleotides, e.g., at
least 60 nucleotides or more nucleotides. Sequence identity can be measured using
sequence analysis software (e.g., Sequence Analysis Software Package of the
Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710
University Avenue, Madison, WI 53705).

The GEP polypeptides useful in practicing the invention include, but are not
limited to, recombinant polypeptides and natural polypeptides. Also useful in the
invention are nucleic acid sequences that encode forms of GEP polypeptides in
which naturally occurring amino acid sequences are altered or deleted. Preferred
nucleic acids encode polypeptides that are soluble under normal physiological
conditions. Also within the invention are nucleic acids encoding fusion proteins in
which a portion of a GEP polypeptide is fused to an unrelated polypeptide (e.g., a
marker polypeptide or a fusion partner) to create a fusion protein. For example,
the polypeptide can be fused to a hexa-histidine tag to facilitate purification of
bacterially expressed polypeptides, or to a hemagglutinin tag to facilitate
purification of polypeptides expressed in eukaryotic cells. The invention also
includes, for example, isolated polypeptides (and the nucleic acids that encode these
polypeptides) that include a first portion and a second portion; the first portion
includes, e.g., a GEP polypeptide, and the second portion includes an
immunoglobulin constant (Fc) region or a detectable marker.

The fusion partner can be, for example, a polypeptide which facilitates
secretion, e.g., a secretory sequence. Such a fused polypeptide is typically referred
to as a preprotein. The secretory sequence can be cleaved by the host cell to form
the mature protein. Also within the invention are nucleic acids that encode a GEP
polypeptide fused to a polypeptide sequence to produce an inactive preprotein.
Preproteins can be converted into the active form of the protein by removal of the
inactivating sequence.

The invention also includes nucleic acids that hybridize, e.g., under stringent
hybridization conditions (as defined herein), to all or a portion of the nucleotide
sequences represented by the SEQ ID NOs. listed in Table 1, or their complements.
The hybridizing portion of the hybridizing nucleic acids is typically at least 15
(e.g., 20, 30, or 50) nucleotides in length. The hybridizing portion of the
hybridizing nucleic acid is at least 80%, e.g., at least 95%, or at least 98%,
identical to the sequence of a portion or all of a nucleic acid encoding a GEP
polypeptide or its complement. Hybridizing nucleic acids of the type described
herein can be used as a cloning probe, a primer (e.g., a PCR primer), or a
diagnostic probe. Nucleic acids that hybridize to the nucleotide sequences
represented by the SEQ ID NOs. listed in Table 1 are considered "antisense
oligonucleotides." Also included within the invention are ribozymes that inhibit the
function of operons containing the GEP genes of the invention, as determined, for
example, in a complementation assay.

Also useful in the invention are various cells, e.g., transformed host cells,
that contain a GEP nucleic acid described herein. A "transformed cell" is a cell
into which (or into an ancestor of which) has been introduced, by means of
recombinant DNA techniques, a nucleic acid encoding a GEP polypeptide. Both
prokaryotic and eukaryotic cells are included, e.g., bacteria, Streptococcus, Bacillus,
and the like.

Also useful in the invention are genetic constructs (e.g., vectors and
plasmids) that include a nucleic acid of the invention which is operably linked to a
transcription and/or translation sequence to enable expression, e.g., expression
vectors. By "operably linked" is meant that a selected nucleic acid, e.g., a DNA
molecule encoding a GEP polypeptide, is positioned adjacent to one or more
sequence elements, e.g., a promoter, which directs transcription and/or translation
of the sequence such that the sequence elements can control transcription and/or
translation of the selected nucleic acid.

The invention also features purified or isolated polypeptides encoded by
nucleic acids located within operons containing GEP genes, as listed in Table 1.
As used herein, both "protein" and "polypeptide" mean any chain of amino acids,
regardless of length or post-translational modification (e.g., glycosylation or
phosphorylation). Thus, the terms gep103 polypeptide, gep1119 polypeptide,
gep1122 polypeptide, gep1315 polypeptide, gep1493 polypeptide, gep1507
polypeptide, gep1511 polypeptide, gep1518 polypeptide, gep1546 polypeptide,
gep1551 polypeptide, gep1561 polypeptide, gep1580 polypeptide, gep1713
polypeptide, gep222 polypeptide, gep2283 polypeptide, gep273 polypeptide, gep286
polypeptide, gep311 polypeptide, gep3262 polypeptide, gep3387 polypeptide, gep47
polypeptide, gep61 polypeptide, and gep76 polypeptide include full-length,
naturally occurring gep103, gep1119, gep1122, gep1315, gep1493, gep1507,
gep1511, gep1518, gep1546, gep1551, gep1561, gep1580, gep1713, gep222,
gep2283, gep273, gep286, gep311, gep3262, gep3387, gep47, gep61, and gep76
proteins, respectively, as well as recombinantly or synthetically produced
polypeptides that correspond to the full-length, naturally occurring proteins, or to a
portion of the naturally occurring or synthetic polypeptide.

A "purified" or "isolated" compound is a composition that is at least 60% by 
weight the compound of interest, e.g., a GEP polypeptide or antibody. Preferably 
the preparation is at least 75% (e.g., at least 90% or 99%) by weight the compound 
of interest. Purity can be measured by any appropriate standard method, e.g.,
column chromatography, polyacrylamide gel electrophoresis, or HPLC analysis.

Preferred GEP polypeptides include a sequence substantially identical to all
or a portion of a naturally occurring GEP polypeptide, e.g., including all or a
portion of the sequences shown in Figs. 1-23. Polypeptides "substantially identical"
to the GEP polypeptide sequences described herein have an amino acid sequence
that is at least 80% (e.g., 85%, 90%, 95%, or 99%) identical to the amino acid
sequence of the GEP polypeptides represented by the SEQ ID NOs. listed in Table
1. For purposes of comparison, the length of the reference GEP polypeptide
sequence will generally be at least 16 amino acids, e.g., at least 20 or 25 amino
acids.

In the case of polypeptide sequences that are less than 100% identical to a
reference sequence, the non-identical positions are preferably, but not necessarily,
conservative substitutions for the reference sequence. Conservative substitutions
typically include substitutions within the following groups: glycine and alanine;
valine, isoleucine, and leucine; aspartic acid and glutamic acid; asparagine and
glutamine; serine and threonine; lysine and arginine; and phenylalanine and
tyrosine.
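
The substitution groups listed above can be encoded directly, as in the following minimal Python sketch. It classifies each non-identical position between a reference sequence and a variant of equal length as conservative or non-conservative according to those groups only. The function names and the two short example sequences are hypothetical, and the sketch assumes the sequences are already aligned without gaps.

```python
# Conservative substitution groups as listed in the text, in one-letter code.
CONSERVATIVE_GROUPS = [
    frozenset("GA"),   # glycine, alanine
    frozenset("VIL"),  # valine, isoleucine, leucine
    frozenset("DE"),   # aspartic acid, glutamic acid
    frozenset("NQ"),   # asparagine, glutamine
    frozenset("ST"),   # serine, threonine
    frozenset("KR"),   # lysine, arginine
    frozenset("FY"),   # phenylalanine, tyrosine
]


def is_conservative(ref_residue: str, var_residue: str) -> bool:
    """True if the two residues differ and fall within the same group."""
    a, b = ref_residue.upper(), var_residue.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str, variant: str):
    """List (position, reference residue, variant residue, kind) per mismatch.

    Positions are 1-based. Assumes an ungapped, equal-length alignment.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length (ungapped)")
    results = []
    for position, (a, b) in enumerate(zip(reference, variant), start=1):
        if a != b:
            kind = "conservative" if is_conservative(a, b) else "non-conservative"
            results.append((position, a, b, kind))
    return results


# Hypothetical seven-residue peptides, for illustration only.
for row in classify_substitutions("MAKELVS", "MGRELIP"):
    print(row)
```
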

Where a particular polypeptide is said to have a specific percent identity to 
a reference polypeptide of a defined length, the percent identity is relative to the 
reference polypeptide. Thus, a polypeptide that is 50% identical to a reference
polypeptide that is 100 amino acids long can be a 50 amino acid polypeptide that is
completely identical to a 50 amino acid long portion of the reference polypeptide. 
It also might be a 100 amino acid long polypeptide which is 50% identical to the 
reference polypeptide over its entire length. Of course, other polypeptides also will 
meet the same criteria. 
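
The two cases described above can be worked through numerically. The following Python sketch computes percent identity relative to the length of the reference polypeptide, assuming the candidate has already been placed against the reference without gaps at a known offset. That assumption is a simplification made for illustration; in practice an alignment would be produced by sequence analysis software. The function name, the offset parameter, and the sequences are hypothetical.

```python
def percent_identity(reference: str, candidate: str, offset: int = 0) -> float:
    """Percent identity of candidate to reference, relative to the reference.

    The candidate is compared, without gaps, to the stretch of the reference
    that starts at `offset`. Matches are divided by the reference length.
    """
    if offset < 0 or offset + len(candidate) > len(reference):
        raise ValueError("candidate must lie within the reference")
    matches = sum(
        1 for i, residue in enumerate(candidate)
        if reference[offset + i] == residue
    )
    return 100.0 * matches / len(reference)


# Made-up 100-residue reference built from a repeating 10-residue unit.
reference = "ACDEFGHIKL" * 10

# Case 1: a 50-residue polypeptide identical to a 50-residue portion.
fragment = reference[25:75]
print(percent_identity(reference, fragment, offset=25))

# Case 2: a 100-residue polypeptide matching at every other position
# ("W" does not occur in the reference, so odd positions never match).
variant = "".join(r if i % 2 == 0 else "W" for i, r in enumerate(reference))
print(percent_identity(reference, variant))
```

Both cases give 50% identity relative to the 100 amino acid reference, even though the two polypeptides differ in length and in how their matching residues are distributed.
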

The invention also features purified or isolated antibodies that specifically
bind to a GEP polypeptide. By "specifically binds" is meant that an antibody
recognizes and binds to a particular antigen, e.g., a GEP polypeptide, but does not
substantially recognize and bind to other molecules in a sample, e.g., a biological
sample that naturally includes a GEP polypeptide.

In another aspect, the invention features a method for detecting a GEP
polypeptide in a sample. This method includes: obtaining a sample suspected of
containing a GEP polypeptide; contacting the sample with an antibody that
specifically binds to a GEP polypeptide under conditions that allow the formation
of complexes of an antibody and the GEP polypeptide; and detecting the
complexes, if any, as an indication of the presence of a GEP polypeptide in the
sample.

Also encompassed by the invention is a method of obtaining a gene related
to (i.e., a functional homolog or ortholog of) a GEP gene. Such a method entails
obtaining a labeled probe that includes an isolated nucleic acid which encodes all
or a portion of a GEP nucleic acid, or a homolog or ortholog thereof; screening a
nucleic acid fragment library with the labeled probe under conditions that allow
hybridization of the probe to nucleic acid fragments in the library, thereby forming
nucleic acid duplexes; isolating labeled duplexes, if any; and preparing a full-length
gene sequence from the nucleic acid fragments in any labeled duplex to obtain a
gene related to the GEP gene.

The invention offers several advantages. For example, the methods for 
identifying antibacterial agents can be configured for high throughput screening of 
numerous candidate antibacterial agents. 

Unless otherwise defined, all technical and scientific terms used herein have
the same meaning as commonly understood by one of ordinary skill in the art to
which this invention belongs. Although methods and materials similar or
equivalent to those described herein can be used in the practice or testing of the
present invention, suitable methods and materials are described herein. All
publications, patent applications, patents, and other references mentioned herein are
incorporated herein by reference in their entirety. In the case of a conflict, the
present specification, including definitions, will control. In addition, the materials,
methods, and examples are illustrative and are not intended to limit the scope of
the invention, which is defined by the claims.
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Other features and advantages of the invention v^ill be apparent from the 
following detailed description, and from the claims. 

Brief Description of the Drawings 
Fig. 1 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep103 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:1, 2, and 3, respectively).

Fig. 2 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep1119 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:4, 5, and 6, respectively).

Fig. 3 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep1122 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:7, 8, and 9, respectively).

Fig. 4 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep1315 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:10, 11, and 12, respectively).

Fig. 5 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep1493 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:13, 14, and 15, respectively).

Fig. 6 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep1507 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:16, 17, and 18, respectively).

Fig. 7 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep1511 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:19, 20, and 21, respectively).

Fig. 8 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep1518 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:22, 23, and 24, respectively).

Fig. 9 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep1546 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:25, 26, and 27, respectively).

Fig. 10 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep1551 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:28, 29, and 30, respectively).

Fig. 11 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep1561 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:31, 32, and 33, respectively).

Fig. 12 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep1580 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:34, 35, and 36, respectively).

Fig. 13 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep1713 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:37, 38, and 39, respectively).
Fig. 14 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep222 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:40, 41, and 42, respectively).

Fig. 15 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep2283 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:43, 44, and 45, respectively).

Fig. 16 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep273 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:46, 47, and 48, respectively).

Fig. 17 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep286 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:49, 50, and 51, respectively).

Fig. 18 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep311 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:52, 53, and 54, respectively).

Fig. 19 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep3262 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:55, 56, and 57, respectively).

Fig. 20 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep3387 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:58, 59, and 60, respectively).

Fig. 21 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep47 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:61, 62, and 63, respectively).

Fig. 22 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep61 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:64, 65, and 66, respectively).

Fig. 23 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the gep76 polypeptide and gene from a
Streptococcus pneumoniae strain (SEQ ID NOs:67, 68, and 69, respectively).

Fig. 24 is a representation of the amino acid and coding strand and non-coding
strand nucleic acid sequences of the B-yneS polypeptide and gene from a
Bacillus subtilis strain (SEQ ID NOs:70, 71, and 72, respectively).

Fig. 25 is a schematic representation of the PCR strategy used to produce
DNA molecules used for targeted deletions of essential genes in Streptococcus
pneumoniae.

Fig. 26 is a schematic representation of the strategy used to produce
targeted deletions of essential genes in Streptococcus pneumoniae.

Detailed Description of the Invention 
Identifying Streptococcus Genes in Essential Operons 
As shown by the experiments described below, each of the GEP genes is
located within an operon that is essential for survival of Streptococcus pneumoniae.
Streptococcus pneumoniae is available from the ATCC. To identify genes located
within essential operons, mutants of Streptococcus pneumoniae were produced. In
general, mutagenesis of Streptococcus pneumoniae can be accomplished using any of
various art-known methods.

In general, and for the examples set forth below, genes located within
essential Streptococcus pneumoniae operons can be identified using genes from a
Streptococcus pneumoniae RX1 genomic library, which was produced using standard
methods (see Kim et al., Nucl. Acids Res. 20:1083-1085 (1992) and Ausubel et
al. (eds.), 1995, Current Protocols in Molecular Biology, (John Wiley & Sons,
NY)). Genes in this Streptococcus library were disrupted using a shuttle
mutagenesis approach with the transposon TnPho-A. Each disrupted gene then was
tested to determine whether it was located within an operon that is essential for
survival of Streptococcus pneumoniae. In this method, 2 ml of LB broth
supplemented with chloramphenicol (10 µg/ml), MgSO4 (10 mM) and maltose
(0.2%) were inoculated with 50 µl of the Streptococcus pneumoniae RX1 plasmid
library. The culture was grown at 37°C while shaking until the OD600 of the
culture reached 0.8 (approximately 2 hours). A 1 ml aliquot of TnPho-A-containing
phage (10^ pfu/ml) was added to 1 ml of the Streptococcus culture,
producing a ratio of approximately 10 phage to 1 cell. The phage and cells were
incubated at for 30 minutes. A 4 ml aliquot of LB broth, warmed to 37°C,
then was added to the phage/cell mixture, and the mixture was incubated at 37°C,
while shaking, for 1 hour. The cells then were pelleted by centrifuging them at
3500 rpm in a Beckman tabletop centrifuge for 5 minutes.

The pelleted cells then were resuspended in 800 µl of LB broth, and a 200
µl aliquot of cells was plated onto each of four petri plates containing LB agar
supplemented with chloramphenicol (10 µg/ml), kanamycin (50 µg/ml), and
erythromycin (300 µg/ml). The plates then were incubated overnight at 37°C, and
the number of colonies appearing on the plates was counted. Approximately
18,000 colonies then were pooled and used to inoculate 50 ml of LB broth, which
was incubated overnight at 37°C. Plasmid DNA from the culture then was
extracted using a Qiagen Midi Prep Kit; other art-known extraction methods can
be substituted.

The concentration of the extracted DNA was measured, and 100 ng of the
DNA was transformed, by electroporation, into E. coli DH10B cells (Gibco BRL).
A 1 ml aliquot of SOC broth then was added to the transformed cells, and the cells
were incubated at 37°C for 1 hour before being pelleted by centrifugation at 3500
rpm for 5 minutes. The cells then were resuspended in 200 µl of LB broth, and
aliquots of 2, 20, and 50 µl were plated onto petri plates containing LB agar and
antibiotics as described above. After incubating the plates overnight at 37°C, 93
colonies were picked and used, individually, to inoculate 1.25 ml of Terrific broth
supplemented with chloramphenicol (10 µg/ml), kanamycin (50 µg/ml), and
erythromycin (300 µg/ml). The cultures were incubated at 37°C for approximately
20 hours, while shaking. The DNA from each culture then was extracted, using a
conventional alkaline lysis miniprep method.

The extracted DNA samples then were used, individually, to transform
Streptococcus pneumoniae cells in a 96-well microtitre format. The transposon
promotes insertion of the mutagenized gene into the bacterial chromosome.
Non-transforming clones indicate that the mutation was within an operon containing
an essential gene.

The non-transforming clones then were grown in 50 ml of Terrific broth
supplemented with chloramphenicol (10 µg/ml), kanamycin (50 µg/ml), and
erythromycin (300 µg/ml). DNA from these clones was extracted and
retransformed into Streptococcus pneumoniae and plated on petri dishes to confirm
that they were non-transforming. The genes located within essential operons then
were sequenced, using primers that hybridize to sequences of the transposon. The
sequences of the primers were: 5'GCAGCCCGGTTTTCCAGAACAGG3' (SEQ ID
NO: 73) and 5'GATTTAGCCCAGTCGGCCGCACG3' (SEQ ID NO: 74).
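
As a quick consistency check on the two transposon-specific sequencing primers quoted above, the following minimal Python sketch reports the length and GC content of each. The helper name `gc_fraction` is illustrative only and is not part of the original disclosure.

```python
# Transposon-specific sequencing primers as quoted in the text (5' to 3').
PRIMERS = {
    "SEQ ID NO: 73": "GCAGCCCGGTTTTCCAGAACAGG",
    "SEQ ID NO: 74": "GATTTAGCCCAGTCGGCCGCACG",
}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.0%}")
```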

In an alternative method, which also was used, the transposon Tn10 was
used to disrupt genes in a Streptococcus pneumoniae fosmid library, which was
produced using standard methods. A 50 ml aliquot of TBMM broth supplemented
with chloramphenicol (10 µg/ml), MgSO4 (10 mM), and maltose (0.2%) was
inoculated with a single fosmid colony from the fosmid library, and the cultures
were grown overnight at 37°C. The cells then were pelleted and resuspended in 5
ml of LB broth supplemented with chloramphenicol (10 µg/ml), MgSO4 (10 mM),
and maltose (0.2%). A 100 µl aliquot of the cells then was mixed with 100 µl of
Tn10 phage lysate (10^10 pfu/ml), and the mixture was incubated at room
temperature for 15 minutes and then incubated at 37°C for 15 minutes.

A 5 ml aliquot of LB broth supplemented with IPTG (1 mM) and sodium
citrate (50 mM) and warmed to 37°C then was added to the cell/phage mixture.
After incubating the cell/phage mixture at 37°C, while shaking, the cells were
pelleted and resuspended in 800 µl of LB broth. The cells then were plated onto 4
plates of LB agar supplemented with chloramphenicol (10 µg/ml) and erythromycin
(300 µg/ml). After incubating the cells overnight at 37°C, at least 10,000 of the
resulting colonies were used to inoculate 50 ml of LB broth. DNA then was
extracted and quantified using standard methods, and 100 ng of DNA were used to
transform E. coli DH10B cells (Gibco BRL) via electroporation. After adding 1 ml
of SOC broth to the cells, the cells were incubated at 37°C for 1 hour. The cells
then were pelleted and suspended in 200 µl LB broth, and aliquots of 2, 20, and 50
µl were plated onto LB agar supplemented with chloramphenicol (10 µg/ml),
kanamycin (50 µg/ml), and erythromycin (300 µg/ml). The plates then were
incubated overnight at 37°C, and 93 colonies were picked and used to inoculate
1.25 ml of Terrific broth supplemented with chloramphenicol (10 µg/ml),
kanamycin (50 µg/ml) and erythromycin (300 µg/ml). These cultures were
incubated for approximately 20 hours, while shaking, and the DNA was isolated
using a standard miniprep method. The extracted DNA then was used to transform
Streptococcus pneumoniae, and the genes located within essential operons were
sequenced as described above. The sequences of the primers used for sequencing
were: 5'CCGCCATTCTTTGCTGTTTCG3' (SEQ ID NO: 75) and
5'TTACACGTTACTAAAGGGAATG3' (SEQ ID NO: 76).
Identification of the gep1493, gep1507, gep1546, gep273, gep286, and gep76
Genes as Essential Genes

As shown by the experiments described below, the gep1493, gep1507,
gep1546, gep273, gep286, and gep76 genes each have been shown to be essential
for survival of Streptococcus pneumoniae. Each of the gep1493, gep1507,
gep1546, gep273, gep286, and gep76 genes has been identified as essential by
creating a targeted deletion of each gene, separately, in Streptococcus pneumoniae.
Each of the gep1493, gep1507, gep1546, gep273, gep286, and gep76 genes
was, separately, replaced with a nucleic acid sequence conferring resistance to the
antibiotic erythromycin (an "erm" gene). Other genetic markers can be used in lieu
of this particular antibiotic resistance marker. Polymerase chain reaction (PCR)
amplification was used to make a targeted deletion in the Streptococcus genomic
DNA, as shown in Fig. 25. Several PCR reactions were used to produce the DNA
molecules needed to carry out targeted deletion of the genes of interest. First, using
primers 5 and 6, an erm gene was amplified from pIL252 from B. subtilis
(available from the Bacillus Genetic Stock Center, Columbus, OH). Primer 5
consists of 21 nucleotides that are identical to the promoter region of the erm gene
and complementary to Sequence A. Primer 5 has the sequence 5'GTG TTC GTG
CTG ACT TGC ACC3' (SEQ ID NO: 77). Primer 6 consists of 21 nucleotides
that are complementary to the 3' end of the erm gene. Primer 6 has the sequence
5'GAA TTA TTT CCT CCC GTT AAA3' (SEQ ID NO: 78). PCR amplification
of the erm gene was carried out under the following conditions: 30 cycles of 94°C
for 1 minute, 55°C for 1 minute, and 72°C for 1.5 minutes, followed by one cycle
of 72°C for 10 minutes.

In the second and third PCR reactions, sequences flanking the gene of
interest were amplified and produced as hybrid DNA molecules that also contained
a portion of the erm gene. The second reaction produced a double-stranded DNA
molecule (termed "Left Flanking Molecule") that includes sequences upstream of
the 5' end of the gene of interest and the first 21 nucleotides of the erm gene. As
shown in Fig. 25, this reaction utilized primer 1, which is 21 nucleotides in length
and identical to a sequence that is located approximately 500 bp upstream of the
translation start site of the gene of interest. Primers 1 and 2 are gene-specific and
include the sequences 5'CTC CGT GAA GTC CAC CTG AT3' (SEQ ID NO:79)
and 5'GGT GCA AGT CAG CAC GAA CAC GCG ACA TAG GTT CCA GTT
AGG3' (SEQ ID NO:80), respectively, for gep1493. Primer 2 is 42 nucleotides in
length, with 21 of the nucleotides at the 3' end of the primer being complementary
to the 5' end of the sense strand of the gene of interest. The 21 nucleotides at the
5' end of the primer are identical to Sequence A and are therefore complementary
to the 5' end of the erm gene. Thus, PCR amplification using primers 1 and 2
produced the left flanking DNA molecule, which is a hybrid DNA molecule
containing a sequence located upstream of the gene of interest and 21 base pairs of
the erm gene, as shown in Fig. 25.

The third PCR reaction was similar to the second reaction, but produced the
right flanking DNA molecule, shown in Fig. 25. The right flanking DNA molecule
contains 21 base pairs of the 3' end of the erm gene, a 21 base pair portion of the
3' end of the gene of interest, and sequences downstream of the gene of interest.
This right flanking DNA molecule was produced with gene-specific primers 3 and
4. For gep1493, primers 3 and 4 included the sequences 5'TTT AAC GGG AGG
AAA TAA TTC CCA TAT CGT GGC TCC TGA AT3' (SEQ ID NO:81) and
5'TAA AGC CCT CAT GTC GAA CC3' (SEQ ID NO:82), respectively. Primer 3
is 42 nucleotides; the 21 nucleotides at the 5' end of Primer 3 are identical to
Sequence B and therefore are identical to the 3' end of the erm gene. The 21
nucleotides at the 3' end of Primer 3 are identical to the 3' end of the gene of
interest. Primer 4 is 21 nucleotides in length and is complementary to a sequence
located approximately 500 bp downstream of the gene of interest. As discussed
above, primers 1-4 are gene-specific, and the sequences disclosed above were used
for gep1493. Gene-specific primers were used to identify the other essential genes
described herein, as shown in Table 2.
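
The primer layout described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually used to design the disclosed primers: the function name `design_deletion_primers`, the coordinate convention (a sense-strand string with the gene at `genome[gene_start:gene_end]`), and the randomly generated toy genome are all hypothetical. The 21-nucleotide tails and the approximately 500 bp flanks follow the text; Sequence A and Sequence B are taken to be the reverse complements of primers 5 and 6.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# erm primers 5 and 6 as quoted in the text (5' to 3').
PRIMER_5 = "GTGTTCGTGCTGACTTGCACC"  # SEQ ID NO: 77
PRIMER_6 = "GAATTATTTCCTCCCGTTAAA"  # SEQ ID NO: 78


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_deletion_primers(genome, gene_start, gene_end, flank=500, n=21):
    """Return primers 1-4 (5' to 3') for a gene at genome[gene_start:gene_end].

    genome is the sense strand; flank and n mirror the ~500 bp flanks and
    21-nt segments described in the text.
    """
    sequence_a = revcomp(PRIMER_5)  # 5' tail of primer 2
    sequence_b = revcomp(PRIMER_6)  # 5' tail of primer 3
    # Primer 1: identical to a sequence about 500 bp upstream of the start.
    primer_1 = genome[gene_start - flank:gene_start - flank + n]
    # Primer 2: Sequence A tail + complement of the gene's 5' end.
    primer_2 = sequence_a + revcomp(genome[gene_start:gene_start + n])
    # Primer 3: Sequence B tail + the gene's 3' end.
    primer_3 = sequence_b + genome[gene_end - n:gene_end]
    # Primer 4: complementary to a sequence about 500 bp downstream.
    primer_4 = revcomp(genome[gene_end + flank - n:gene_end + flank])
    return primer_1, primer_2, primer_3, primer_4


if __name__ == "__main__":
    rng = random.Random(0)  # toy genome, for illustration only
    toy = "".join(rng.choice("ACGT") for _ in range(2000))
    for i, primer in enumerate(design_deletion_primers(toy, 700, 1300), 1):
        print(f"Primer {i}: 5'{primer}3' ({len(primer)} nt)")
```

With this layout, the product of primers 1 and 2 ends in 21 base pairs that overlap one end of the erm amplicon, and the product of primers 3 and 4 begins with 21 base pairs that overlap the other end, which is what lets the three fragments be joined in a later fusion PCR.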
TABLE 2: Primers Used in Identifying Essential Genes

Gene     Primer    Sequence                                         SEQ ID NO

gep1493  Primer 1  5'CTCCGTGAAGTCCACCTGAT3'                         79
         Primer 2  5'GGTGCAAGTCAGCACGAACACTGCTCGCGTAGATTGATTTG3'    80
         Primer 3  5'TTTAACGGGAGGAAATAATTCGGGGATTGAACCTAACCCAT3'    81
         Primer 4  5'TTGGCAAGAAGGCAGAGAAT3'                         82

gep1507  Primer 1  5'GCATGAGAAACCCAGTCTCC3'                         83
         Primer 2  5'GGTGCAAGTCAGCACGAACACGCGACATAGGTTCCAGTTAGG3'   84
         Primer 3  5'TTTAACGGGAGGAAATAATTCCCATATCGTGGCTCCTGAAT3'    85
         Primer 4  5'TAAAGCCCTCATGTCGAACC3'                         86

gep1546  Primer 1  5'CAGTGACGATACAGATGAAGAA3'                       87
         Primer 2  5'GGTGCAAGTCAGCACGAACACGATGCTGGCTTCGTTGAGTG3'    88
         Primer 3  5'TTTAACGGGAGGAAATAATTCGTCGCGACTCCTAGCCATAC3'    89
         Primer 4  5XCAGCAAAGGAAAACCGATA3'                          90

gep273   Primer 1  5'GGTCAGTGACAGCAGCAGAT3'                         91
         Primer 2  5'GGTGCAAGTCAGCACGAACACGGCCTTGGAAAAAAGACCAT3'    92
         Primer 3  5'TTTAACGGGAGGAAATAATTCCCGCTTAAATTCTGCCAATC3'    93
         Primer 4  5'CCCATAACCGTATCACCTGG3'                         94

gep286   Primer 1  5'CGGAACGGCTATGAAAAAAA3'                         95
         Primer 2  5'GGTGCAAGTCAGCACGAACACACGACGAAAGGCAACCATAC3'    96
         Primer 3  5'TTTAACGGGAGGAAATAATTCTGGTATGGGGGTTGATGAAG3'    97
         Primer 4  5'TCGCCCTACTTTTCGTATGC3'                         98

gep76    Primer 1  5'AGCGATATTAGTGCGGGAGA3'                         99
         Primer 2  5'GGTGCAAGTCAGCACGAACACCAGCAATTTTGTCATCAGTCG3'   100
         Primer 3  5'TTTAACGGGAGGAAATAATTCCTGGGGTAATGGAGCACAGT3'    101
         Primer 4  5'GGGATTGTCACGGTAAAACC3'                         102
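
One property of Table 2 that can be checked mechanically is that every Primer 2 begins with the reverse complement of erm primer 5 and every Primer 3 begins with the reverse complement of erm primer 6. The following Python sketch performs that check on the sequences as transcribed from the table; the function and variable names are illustrative only, and any transcription error in the table would carry over into the result.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMER_5 = "GTGTTCGTGCTGACTTGCACC"  # SEQ ID NO: 77
PRIMER_6 = "GAATTATTTCCTCCCGTTAAA"  # SEQ ID NO: 78

# (Primer 2, Primer 3) for each gene, transcribed from Table 2 (5' to 3').
TABLE_2 = {
    "gep1493": ("GGTGCAAGTCAGCACGAACACTGCTCGCGTAGATTGATTTG",
                "TTTAACGGGAGGAAATAATTCGGGGATTGAACCTAACCCAT"),
    "gep1507": ("GGTGCAAGTCAGCACGAACACGCGACATAGGTTCCAGTTAGG",
                "TTTAACGGGAGGAAATAATTCCCATATCGTGGCTCCTGAAT"),
    "gep1546": ("GGTGCAAGTCAGCACGAACACGATGCTGGCTTCGTTGAGTG",
                "TTTAACGGGAGGAAATAATTCGTCGCGACTCCTAGCCATAC"),
    "gep273": ("GGTGCAAGTCAGCACGAACACGGCCTTGGAAAAAAGACCAT",
               "TTTAACGGGAGGAAATAATTCCCGCTTAAATTCTGCCAATC"),
    "gep286": ("GGTGCAAGTCAGCACGAACACACGACGAAAGGCAACCATAC",
               "TTTAACGGGAGGAAATAATTCTGGTATGGGGGTTGATGAAG"),
    "gep76": ("GGTGCAAGTCAGCACGAACACCAGCAATTTTGTCATCAGTCG",
              "TTTAACGGGAGGAAATAATTCCTGGGGTAATGGAGCACAGT"),
}


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def check_tails(table):
    """Map each gene to (primer 2 tail ok, primer 3 tail ok)."""
    tail_2, tail_3 = revcomp(PRIMER_5), revcomp(PRIMER_6)
    return {
        gene: (p2.startswith(tail_2), p3.startswith(tail_3))
        for gene, (p2, p3) in table.items()
    }


if __name__ == "__main__":
    for gene, (ok_2, ok_3) in check_tails(TABLE_2).items():
        print(f"{gene}: primer 2 tail ok={ok_2}, primer 3 tail ok={ok_3}")
```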



PCR amplification of the left and right flanking DNA molecules was carried
out, separately, in 50 µl reaction mixtures containing: 1 µl Streptococcus
pneumoniae (RX1) DNA (0.25 µg), 2.5 µl Primer 1 or Primer 4 (10 pmol/µl),
2.5 µl Primer 2 or Primer 3 (20 pmol/µl), 1.2 µl of a dNTP mixture (10 mM each),
37 µl H2O, 0.7 µl Taq polymerase (5 U/µl), and 5 µl 10x Taq polymerase buffer
(10 mM Tris, 50 mM KCl, 2.5 mM MgCl2). The left and right flanking DNA
molecules were amplified using the following PCR cycling program: 95°C for 2
minutes; 72°C for 1 minute; 94°C for 30 seconds; 49°C for 30 seconds; 72°C for 1
minute; repeating the 94°C, 49°C, and 72°C incubations 30 times; 72°C for 10
minutes and then stopping the reactions. A 15 µl aliquot of each reaction mixture
then was electrophoresed through a 1.2% low melting point agarose gel in TAE
buffer and then stained with ethidium bromide. Fragments containing the amplified
left and right flanking DNA molecules were excised from the gel and purified
using the QIAQUICK™ gel extraction kit (Qiagen, Inc.). Other art-known methods
for amplifying and isolating DNA can be substituted. The left and right flanking
DNA fragments were eluted into 30 µl of TE buffer at pH 8.0.
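
The component volumes listed for the flanking-fragment PCR can be tallied to confirm that they add up to approximately the stated 50 µl. The Python sketch below does this and also scales the recipe to several reactions; the `overage` parameter (extra volume to cover pipetting losses) is an assumption added for illustration and is not taken from the text.

```python
# Component volumes (µl) for one flanking-fragment PCR, as listed in the text.
FLANKING_PCR_UL = {
    "S. pneumoniae RX1 genomic DNA": 1.0,
    "Primer 1 or Primer 4": 2.5,
    "Primer 2 or Primer 3": 2.5,
    "dNTP mixture": 1.2,
    "H2O": 37.0,
    "Taq polymerase": 0.7,
    "10x Taq polymerase buffer": 5.0,
}


def total_volume(components: dict) -> float:
    """Sum the component volumes, rounded to 0.1 µl."""
    return round(sum(components.values()), 1)


def scale_recipe(components: dict, n_reactions: int, overage: float = 0.1) -> dict:
    """Scale every component to n_reactions plus a fractional overage."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in components.items()}


if __name__ == "__main__":
    print(f"Total per reaction: {total_volume(FLANKING_PCR_UL)} µl")
    for name, vol in scale_recipe(FLANKING_PCR_UL, 12).items():
        print(f"{name}: {vol} µl")
```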

The amplified erm gene and left and right flanking DNA molecules were
then fused together to produce the fusion product, as shown in Fig. 25. The fusion
PCR reaction was carried out in a volume of 50 µl containing: 2 µl of each of the
left and right flanking DNA molecules and the erm gene PCR product; 5 µl of 10x
buffer; 2.5 µl of Primer 1 (10 pmol/µl); 2.5 µl of Primer 4 (10 pmol/µl); 1.2 µl
dNTP mix (10 mM each); 32 µl H2O; and 0.7 µl Taq polymerase. The PCR
reaction was carried out using the following cycling program: 95°C for 2 minutes;
72°C for 1 minute; 94°C for 30 seconds; 48°C for 30 seconds; 72°C for 3 minutes;
repeat the 94°C, 48°C, and 72°C incubations 25 times; 72°C for 10 minutes. After
the reaction was stopped, a 12 µl aliquot of the reaction mixture was
electrophoresed through an agarose gel to confirm the presence of a final product
of approximately 2 kb.
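
The fusion PCR cycling program can be written down as a small data structure, which makes it easy to estimate how long the block holds temperature in total. The Python sketch below is illustrative only: it reads "repeat ... 25 times" as 25 cycles in total (an assumption), and it ignores ramp times between temperatures, so a real run would take longer.

```python
# Fusion PCR program from the text: (temperature in °C, hold time in seconds).
FUSION_PROGRAM = {
    "initial": [(95, 120), (72, 60)],
    "cycle": [(94, 30), (48, 30), (72, 180)],
    "n_cycles": 25,  # assumption: 25 cycles in total
    "final": [(72, 600)],
}


def hold_time_minutes(program: dict) -> float:
    """Total programmed hold time in minutes, excluding temperature ramps."""
    seconds = sum(t for _, t in program["initial"])
    seconds += program["n_cycles"] * sum(t for _, t in program["cycle"])
    seconds += sum(t for _, t in program["final"])
    return seconds / 60


if __name__ == "__main__":
    print(f"Programmed hold time: {hold_time_minutes(FUSION_PROGRAM):.0f} min")
```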

A 5 µl aliquot of the fusion product was used to transform S. pneumoniae
grown on a medium containing erythromycin in accordance with standard
techniques. As shown in Fig. 26, the fusion product and the S. pneumoniae
genome undergo a homologous recombination event so that the erm gene replaces
the chromosomal copy of the gene of interest, thereby creating a gene knockout.
Disruption of an essential gene results in no growth on a medium containing
erythromycin. Using this gene knockout method, the gep1493, gep1507, gep1546,
gep273, gep286, and gep76 genes were each identified as being essential for
survival.
Identification of Homologs and Orthologs of GEP Polypeptides

Having shown that the various GEP genes are essential or located within
operons that are essential for survival of Streptococcus, it can be expected that
homologs and orthologs of the polypeptides encoded by these genes, when present
in other organisms, for example B. subtilis, are essential or located within operons
that are essential for survival of that organism as well, and therefore are useful
targets for identifying antibacterial agents. Using the sequences of the GEP
polypeptides identified in Streptococcus, homologs and orthologs of these
polypeptides can be identified in other organisms. For example, the coding
sequences of the GEP nucleic acids can be used to search the GenBank database of
nucleotide sequences to identify homologs or orthologs that are expressed from
essential operons in other organisms. Sequence comparisons can be performed
using the Basic Local Alignment Search Tool (BLAST) (Altschul et al., J. Mol.
Biol. 215:403-410, 1990). The percent sequence identity shared by the GEP
polypeptides and their homologs or orthologs can be determined using the GAP
program from the Genetics Computer Group (GCG) Wisconsin Sequence Analysis
Package (Wisconsin Package Version 9.0, GCG; Madison, WI). The following
parameters are suitable: gap creation penalty, 12 (protein) 50 (DNA); gap
extension penalty, 4 (protein) 3 (DNA). Typically, the GEP polypeptides and their
homologs share at least 25% (e.g., at least 40%) sequence identity. Typically, the
DNA sequences encoding GEP polypeptides and their homologs share at least 35%
(e.g., at least 45%) sequence identity. To confirm that the homologs or orthologs
of the GEP polypeptides are expressed from operons that are essential for survival
of bacteria, the operon encoding each of the homologs or orthologs can be,
separately, deleted from the genome of the host organism.
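
To make the identity thresholds above concrete, the following Python sketch computes percent identity over a pair of sequences that have already been aligned (gaps written as "-") and compares the result with the 25% (protein) and 35% (DNA) figures from the text. It is a simplified illustration and not the GAP program: the choice to count every alignment column in the denominator is an assumption, and other tools may define percent identity differently. The example sequences are invented.

```python
# Minimum identity typically shared with a homolog, per the text.
MIN_IDENTITY = {"protein": 25.0, "dna": 35.0}


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues.

    Inputs must be pre-aligned and of equal length; '-' marks a gap.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(identity: float, molecule: str = "protein") -> bool:
    """True if identity reaches the typical minimum for the molecule type."""
    return identity >= MIN_IDENTITY[molecule]


if __name__ == "__main__":
    identity = percent_identity("MKT-LVAG", "MKSALVSG")  # invented example
    print(f"{identity:.1f}% identity, homolog candidate: {meets_threshold(identity)}")
```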

Identification of Essential Operons in Additional Streptococcus Strains 

Now that the various GEP genes have been identified as being located
within operons that are essential for survival, these genes, or fragments thereof, can
be used to detect homologous or orthologous genes in other organisms. In
particular, these genes can be used to analyze various pathogenic and
non-pathogenic strains of bacteria. Fragments of a nucleic acid (DNA or RNA)
encoding a GEP polypeptide or homolog or ortholog (or sequences complementary
thereto) can be used as probes in conventional nucleic acid hybridization assays of
pathogenic bacteria. For example, nucleic acid probes (which typically are 8-30, or
usually 15-20, nucleotides in length) can be used to detect GEP genes or homologs
or orthologs thereof in art-known molecular biology methods, such as Southern
blotting, Northern blotting, dot or slot blotting, PCR amplification methods, colony
hybridization methods, and the like. Typically, an oligonucleotide probe based on
the nucleic acid sequences described herein, or fragments thereof, is labeled and
used to screen a genomic library constructed from mRNA obtained from a
Streptococcus or bacterial strain of interest. A suitable method of labeling involves
using polynucleotide kinase to add 32P-labeled ATP to the oligonucleotide used as
the probe. This method is well known in the art, as are several other suitable
methods (e.g., biotinylation and enzyme labeling).

Hybridization of the oligonucleotide probe to the library, or other nucleic
acid sample, typically is performed under stringent to highly stringent conditions.
Nucleic acid duplex or hybrid stability is expressed as the melting temperature or
Tm, which is the temperature at which a probe dissociates from a target DNA. This
melting temperature is used to define the required stringency conditions. If
sequences are to be identified that are related and substantially identical to the
probe, rather than identical, then it is useful to first establish the lowest temperature
at which only homologous hybridization occurs with a particular concentration of
salt (e.g., SSC or SSPE). Then, assuming that 1% mismatching results in a 1°C
decrease in the Tm, the temperature of the final wash in the hybridization reaction
is reduced accordingly (for example, if sequences having > 95% identity with the
probe are sought, the final wash temperature is decreased by 5°C). In practice, the
change in Tm can be between 0.5° and 1.5°C per 1% mismatch.
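
The wash-temperature adjustment described above is simple arithmetic and can be sketched in Python as follows. The function name and the 65°C starting temperature used in the example are hypothetical; only the 1°C per 1% mismatch rule of thumb and the 0.5° to 1.5°C range come from the text.

```python
def wash_temperature(base_temp_c: float,
                     min_identity_pct: float,
                     deg_per_pct_mismatch: float = 1.0) -> float:
    """Final wash temperature (°C) for targets of at least the given identity.

    base_temp_c is the lowest temperature at which only homologous
    hybridization occurs for the chosen salt concentration.
    """
    mismatch_pct = 100.0 - min_identity_pct
    return base_temp_c - mismatch_pct * deg_per_pct_mismatch


if __name__ == "__main__":
    base = 65.0  # hypothetical empirically determined temperature
    for slope in (0.5, 1.0, 1.5):
        temp = wash_temperature(base, 95.0, slope)
        print(f"{slope} °C per 1% mismatch: wash at {temp} °C")
```

Under the default 1°C per 1% assumption, seeking sequences with at least 95% identity lowers the wash by 5°C, matching the example in the text; the 0.5° to 1.5°C range widens that to a reduction of between 2.5°C and 7.5°C.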

WO 99/33871                                                  PCT/US98/27918

As used herein, highly stringent conditions refer to hybridization at 68°C in
5x SSC/5x Denhardt's solution/1.0% SDS, and washing in 0.2x SSC/0.1% SDS at
42°C. Stringent conditions refer to washing in 3x SSC at 42°C. The parameters of
salt concentration and temperature can be varied to achieve the optimal level of
identity between the probe and the target nucleic acid. Additional guidance
regarding such conditions is readily available in the art, for example, by Sambrook
et al., 1989, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press,
N.Y.; and Ausubel et al. (eds.), 1995, Current Protocols in Molecular Biology,
(John Wiley & Sons, N.Y.) at Unit 2.10.

In one approach, libraries constructed from pathogenic or non-pathogenic
Streptococcus or bacterial strains can be screened. For example, such strains can
be screened for expression of GEP genes by Northern blot analysis. Upon
detection of transcripts of the GEP genes or homologs or orthologs thereof,
libraries can be constructed from RNA isolated from the appropriate strain, utilizing
standard techniques well known to those of skill in the art. Alternatively, a total
genomic DNA library can be screened using a GEP gene probe (or a probe
directed to a homolog or ortholog thereof).

New gene sequences can be isolated, for example, by performing PCR using
two degenerate oligonucleotide primer pools designed on the basis of nucleotide
sequences within the GEP genes, or their homologs or orthologs, as depicted
herein. The template for the reaction can be DNA obtained from strains known or
suspected to express a GEP allele or an allele of a homolog or ortholog thereof.
The PCR product can be subcloned and sequenced to ensure that the amplified
sequences represent the sequences of a new GEP nucleic acid sequence, or a
sequence of a homolog or ortholog thereof.
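
A degenerate primer pool is the set of all specific oligonucleotides encoded by a primer written with IUPAC ambiguity codes. The Python sketch below expands such a primer into its pool and reports the pool size; the example primer is invented for illustration and is not derived from any GEP sequence.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand_degenerate(primer: str) -> list:
    """List every specific sequence in the degenerate primer pool."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    example = "ATGGCNGAYAAR"  # hypothetical degenerate primer
    print(f"{example}: pool of {degeneracy(example)} sequences")
    for seq in expand_degenerate(example)[:4]:
        print(seq)
```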

Synthesis of the various GEP polypeptides or their homologs or orthologs
(or an antigenic fragment thereof) for use as antigens, or for other purposes, can
readily be accomplished using any of the various art-known techniques. For
example, a polypeptide or homolog or ortholog thereof, or an antigenic fragment(s),
can be synthesized chemically in vitro, or enzymatically (e.g., by in vitro
transcription and translation). Alternatively, the gene can be expressed in, and the
polypeptide purified from, a cell (e.g., a cultured cell) by using any of the
numerous, available gene expression systems. For example, the polypeptide antigen
can be produced in a prokaryotic host (e.g., E. coli or B. subtilis) or in eukaryotic
cells, such as yeast cells or insect cells (e.g., by using a baculovirus-based
expression vector).

Proteins and polypeptides can also be produced in plant cells, if desired.
For plant cells, viral expression vectors (e.g., cauliflower mosaic virus and tobacco
mosaic virus) and plasmid expression vectors (e.g., Ti plasmid) are suitable. Such
cells are available from a wide range of sources (e.g., the American Type Culture
Collection, Rockville, MD; also, see, e.g., Ausubel et al., Current Protocols in
Molecular Biology, John Wiley & Sons, New York, 1994). The optimal methods
of transformation or transfection and the choice of expression vehicle will depend
on the host system selected. Transformation and transfection methods are
described, e.g., in Ausubel et al., supra; expression vehicles may be chosen from
those provided, e.g., in Cloning Vectors: A Laboratory Manual (P.H. Pouwels et
al., 1985, Supp. 1987). The host cells harboring the expression vehicle can be
cultured in conventional nutrient media, adapted as needed for activation of a
chosen gene, repression of a chosen gene, selection of transformants, or
amplification of a chosen gene.

If desired, GEP polypeptides or their homologs or orthologs can be
produced as fusion proteins. For example, the expression vector pUR278 (Ruther
et al., EMBO J., 2:1791, 1983) can be used to create lacZ fusion proteins. The
art-known pGEX vectors can be used to express foreign polypeptides as fusion
proteins with glutathione S-transferase (GST). In general, such fusion proteins are
soluble and can be easily purified from lysed cells by adsorption to
glutathione-agarose beads followed by elution in the presence of free glutathione.
The pGEX vectors are designed to include thrombin or factor Xa protease cleavage
sites so that the cloned target gene product can be released from the GST moiety.

In an exemplary insect cell expression system, a baculovirus such as
Autographa californica nuclear polyhedrosis virus (AcNPV), which grows in
Spodoptera frugiperda cells, can be used as a vector to express foreign genes. A
coding sequence encoding a GEP polypeptide or homolog or ortholog can be
cloned into a non-essential region (for example the polyhedrin gene) of the viral
genome and placed under control of a promoter, e.g., the polyhedrin promoter or
an exogenous promoter. Successful insertion of a gene encoding a GEP
polypeptide or homolog or ortholog can result in inactivation of the polyhedrin
gene and production of non-occluded recombinant virus (i.e., virus lacking the
proteinaceous coat encoded by the polyhedrin gene). These recombinant viruses
are then used to infect insect cells (e.g., Spodoptera frugiperda cells) in which the
inserted gene is expressed (see, e.g., Smith et al., J. Virol., 46:584, 1983; Smith,
U.S. Patent No. 4,215,051).

In mammalian host cells, a number of viral-based expression systems can be 
utilized. When an adenovirus is used as an expression vector, the nucleic acid 
sequence encoding the GEP polypeptide or homolog or ortholog can be ligated to 
an adenovirus transcription/translation control complex, e.g., the late promoter and
tripartite leader sequence. This chimeric gene can then be inserted into the
adenovirus genome by in vitro or in vivo recombination. Insertion into a
non-essential region of the viral genome (e.g., region E1 or E3) will result in a
recombinant virus that is viable and capable of expressing an essential gene product
in infected hosts (see, e.g., Logan, Proc. Natl. Acad. Sci. USA, 81:3655, 1984).

Specific initiation signals may be required for efficient translation of
inserted nucleic acid sequences. These signals include the ATG initiation codon
and adjacent sequences. In general, exogenous translational control signals,
including, perhaps, the ATG initiation codon, should be provided. Furthermore, the
initiation codon must be in phase with the reading frame of the desired coding
sequence to ensure translation of the entire sequence. These exogenous
translational control signals and initiation codons can be of a variety of origins,
both natural and synthetic. The efficiency of expression may be enhanced by the
inclusion of appropriate transcription enhancer elements, or transcription
terminators (Bittner et al., Methods in Enzymol., 153:516, 1987).
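
The requirement that the initiation codon be in phase with the coding sequence can be checked computationally before a construct is made. The following is a minimal sketch, assuming the construct is available as a plain DNA string; the function name `orf_in_frame` and the example sequence are hypothetical and are not taken from this description.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_in_frame(construct: str, coding_start: int) -> bool:
    """Return True if the first ATG in `construct` is in phase with the
    coding sequence beginning at index `coding_start`, and no stop codon
    lies between that ATG and the start of the coding sequence."""
    seq = construct.upper()
    atg = seq.find("ATG")
    # No initiation codon, or the first ATG lies past the coding start.
    if atg == -1 or atg > coding_start:
        return False
    # The distance from the ATG to the coding sequence must be whole codons.
    if (coding_start - atg) % 3 != 0:
        return False
    # Walk codon by codon and reject any premature stop codon.
    for i in range(atg, coding_start, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return False
    return True


# Hypothetical construct: an exogenous ATG at index 2, coding sequence at index 11.
example = "CCATGGGATCCAAACCC"
print(orf_in_frame(example, 11))
print(orf_in_frame(example, 12))
```

Shifting the coding start by a single nucleotide changes the result, which is the practical consequence of the in-phase requirement stated above.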

The GEP polypeptides and homologs and orthologs can be expressed
individually or as fusions with a heterologous polypeptide, such as a signal
sequence or other polypeptide having a specific cleavage site at the N- and/or
C-terminus of the protein or polypeptide. The heterologous signal sequence selected
should be one that is recognized and processed, i.e., cleaved by a signal peptidase,
by the host cell in which the fusion protein is expressed.

A host cell can be chosen that modulates the expression of the inserted
sequences, or modifies and processes the gene product in a specific, desired
fashion. Such modifications and processing (e.g., cleavage) of protein products
may facilitate optimal functioning of the protein. Various host cells have
characteristic and specific mechanisms for post-translational processing and
modification of proteins and gene products. Appropriate cell lines or host systems
familiar to those of skill in the art of molecular biology can be chosen to ensure
the correct modification and processing of the foreign protein expressed. To this
end, eukaryotic host cells that possess the cellular machinery for proper processing
of the primary transcript and phosphorylation of the gene product can be used.
Such mammalian host cells include, but are not limited to, CHO, VERO, BHK,
HeLa, COS, MDCK, 293, 3T3, WI38, and choroid plexus cell lines.

If desired, the GEP polypeptide or homolog or ortholog thereof can be
produced by a stably-transfected mammalian cell line. A number of vectors
suitable for stable transfection of mammalian cells are available to the public, see,
e.g., Pouwels et al. (supra); methods for constructing such cell lines are also
publicly known, e.g., in Ausubel et al. (supra). In one example, DNA encoding the
protein is cloned into an expression vector that includes the dihydrofolate reductase
(DHFR) gene. Integration of the plasmid and, therefore, the GEP
polypeptide-encoding gene into the host cell chromosome is selected for by including 0.01-300
µM methotrexate in the cell culture medium (as described in Ausubel et al., supra).
This dominant selection can be accomplished in most cell types.

Recombinant protein expression can be increased by DHFR-mediated
amplification of the transfected gene. Methods for selecting cell lines bearing gene
amplifications are described in Ausubel et al. (supra); such methods generally
involve extended culture in medium containing gradually increasing levels of
methotrexate. DHFR-containing expression vectors commonly used for this
purpose include pCVSEII-DHFR and pAdD26SV(A) (described in Ausubel et al.,
supra).

A number of other selection systems can be used, including but not limited 
to, herpes simplex virus thymidine kinase genes, hypoxanthine-guanine 
phosphoribosyl-transferase genes, and adenine phosphoribosyltransferase genes, 
which can be employed in tk⁻, hgprt⁻, or aprt⁻ cells, respectively. In addition, gpt,
which confers resistance to mycophenolic acid (Mulligan et al., Proc. Natl. Acad.
Sci. USA, 78:2072, 1981); neo, which confers resistance to the aminoglycoside
G-418 (Colberre-Garapin et al., J. Mol. Biol., 150:1, 1981); and hygro, which confers
resistance to hygromycin (Santerre et al., Gene, 30:147, 1981), can be used.

Alternatively, any fusion protein can be readily purified by utilizing an
antibody or other molecule that specifically binds to the fusion protein being
expressed. For example, a system described in Janknecht et al., Proc. Natl. Acad.
Sci. USA, 88:8972 (1981), allows for the ready purification of non-denatured fusion
proteins expressed in human cell lines. In this system, the gene of interest is
subcloned into a vaccinia recombination plasmid such that the gene's open reading
frame is translationally fused to an amino-terminal tag consisting of six histidine
residues. Extracts from cells infected with recombinant vaccinia virus are loaded
onto Ni²⁺ nitriloacetic acid-agarose columns, and histidine-tagged proteins are
selectively eluted with imidazole-containing buffers.

Alternatively, a GEP polypeptide or homolog or ortholog, or a portion 
thereof, can be fused to an immunoglobulin Fc domain. Such a fusion protein can
be readily purified using a protein A column, for example. Moreover, such fusion 
proteins permit the production of a chimeric form of a GEP polypeptide or 
homolog or ortholog having increased stability in vivo. 

Once the recombinant GEP polypeptide (or homolog or ortholog) is
expressed, it can be isolated (i.e., purified). Secreted forms of the polypeptides can
be isolated from cell culture media, while non-secreted forms must be isolated from
the host cells. Polypeptides can be isolated by affinity chromatography. For
example, an anti-gep103 antibody (e.g., produced as described herein) can be
attached to a column and used to isolate the protein. Lysis and fractionation of
cells harboring the protein prior to affinity chromatography can be performed by
standard methods (see, e.g., Ausubel et al., supra). Alternatively, a fusion protein
can be constructed and used to isolate a GEP polypeptide (e.g., a gep103-maltose
binding fusion protein, a gep103-β-galactosidase fusion protein, or a gep103-trpE
fusion protein; see, e.g., Ausubel et al., supra; New England Biolabs Catalog,
Beverly, MA). The recombinant protein can, if desired, be further purified, e.g.,
by high performance liquid chromatography using standard techniques (see, e.g.,
Fisher, Laboratory Techniques In Biochemistry And Molecular Biology, eds., Work
and Burdon, Elsevier, 1980).

Given the amino acid sequences described herein, polypeptides useful in
practicing the invention, particularly fragments of GEP polypeptides, can be
produced by standard chemical synthesis (e.g., by the methods described in Solid
Phase Peptide Synthesis, 2nd ed., The Pierce Chemical Co., Rockford, IL, 1984)
and used as antigens, for example.

Antibodies 

The GEP polypeptides (or antigenic fragments or analogs of such
polypeptides) can be used to raise antibodies useful in the invention, and such
polypeptides can be produced by recombinant or peptide synthetic techniques (see,
e.g., Solid Phase Peptide Synthesis, supra; Ausubel et al., supra). Likewise,
antibodies can be raised against the GEP homologs and orthologs. In general, the
polypeptides can be coupled to a carrier protein, such as KLH, as described in
Ausubel et al., supra, mixed with an adjuvant, and injected into a host mammal.
Antibodies can be purified, for example, by affinity chromatography methods in
which the polypeptide antigen is immobilized on a resin.

In particular, various host animals can be immunized by injection of a
polypeptide of interest. Examples of suitable host animals include rabbits, mice,
guinea pigs, and rats. Various adjuvants can be used to increase the immunological
response, depending on the host species, including but not limited to Freund's
(complete and incomplete) adjuvant, mineral gels such as aluminum
hydroxide, surface active substances such as lysolecithin, pluronic polyols,
polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, dinitrophenol,
BCG (bacille Calmette-Guerin) and Corynebacterium parvum. Polyclonal
antibodies are heterogeneous populations of antibody molecules derived from the
sera of the immunized animals.

Antibodies useful in the invention include monoclonal antibodies, polyclonal 
antibodies, humanized or chimeric antibodies, single chain antibodies, Fab 
fragments, F(ab')2 fragments, and molecules produced using a Fab expression 
library. 

Monoclonal antibodies (mAbs), which are homogeneous populations of
antibodies to a particular antigen, can be prepared using the GEP polypeptides or
homologs or orthologs thereof and standard hybridoma technology (see, e.g.,
Kohler et al., Nature, 256:495, 1975; Kohler et al., Eur. J. Immunol., 6:511, 1976;
Kohler et al., Eur. J. Immunol., 6:292, 1976; Hammerling et al., In Monoclonal
Antibodies and T Cell Hybridomas, Elsevier, NY, 1981; Ausubel et al., supra).

In particular, monoclonal antibodies can be obtained by any technique that 
provides for the production of antibody molecules by continuous cell lines in 
culture, such as those described in Kohler et al., Nature, 256:495, 1975, and U.S.
Patent No. 4,376,110; the human B-cell hybridoma technique (Kosbor et al.,
Immunology Today, 4:72, 1983; Cole et al., Proc. Natl. Acad. Sci. USA, 80:2026,
1983); and the EBV-hybridoma technique (Cole et al., Monoclonal Antibodies and
Cancer Therapy, Alan R. Liss, Inc., pp. 77-96, 1983). Such antibodies can be of
any immunoglobulin class including IgG, IgM, IgE, IgA, IgD, and any subclass
thereof. The hybridomas producing the mAbs of this invention can be cultivated in
vitro or in vivo.

Once produced, polyclonal or monoclonal antibodies are tested for specific
recognition of a GEP polypeptide or homolog or ortholog thereof in an
immunoassay, such as a Western blot or immunoprecipitation analysis using
standard techniques, e.g., as described in Ausubel et al., supra. Antibodies that
specifically bind to the GEP polypeptides, or conservative variants and homologs or
orthologs thereof, are useful in the invention. For example, such antibodies can be
used in an immunoassay to detect a GEP polypeptide in pathogenic or
non-pathogenic strains of bacteria.

Preferably, antibodies of the invention are produced using fragments of the
GEP polypeptides that appear likely to be antigenic, by criteria such as high
frequency of charged residues. In one specific example, such fragments are
generated by standard techniques of PCR, and are then cloned into the pGEX
expression vector (Ausubel et al., supra). Fusion proteins are expressed in E. coli
and purified using a glutathione agarose affinity matrix as described in Ausubel et
al., supra.

If desired, several (e.g., two or three) fusions can be generated for each 
protein, and each fusion can be injected into at least two rabbits. Antisera can be 
raised by injections in a series, typically including at least three booster injections. 
Typically, the antisera are checked for their ability to immunoprecipitate a recombinant
GEP polypeptide or homolog or ortholog, or unrelated control proteins, such as
glucocorticoid receptor, chloramphenicol acetyltransferase, or luciferase.

Techniques developed for the production of "chimeric antibodies" (Morrison
et al., Proc. Natl. Acad. Sci., 81:6851, 1984; Neuberger et al., Nature, 312:604,
1984; Takeda et al., Nature, 314:452, 1984) can be used to splice the genes from a
mouse antibody molecule of appropriate antigen specificity together with genes
from a human antibody molecule of appropriate biological activity. A chimeric
antibody is a molecule in which different portions are derived from different animal
species, such as those having a variable region derived from a murine mAb and a
human immunoglobulin constant region.

Alternatively, techniques described for the production of single chain
antibodies (U.S. Patent Nos. 4,946,778 and 4,704,692) can
be adapted to produce single chain antibodies against a GEP polypeptide or
homolog or ortholog. Single chain antibodies are formed by linking the heavy and
light chain fragments of the Fv region via an amino acid bridge, resulting in a
single chain polypeptide.

Antibody fragments that recognize and bind to specific epitopes can be 
generated by known techniques. For example, such fragments can include but are 
not limited to F(ab')2 fragments, which can be produced by pepsin digestion of the
antibody molecule, and Fab fragments, which can be generated by reducing the
disulfide bridges of F(ab')2 fragments. Alternatively, Fab expression libraries can
be constructed (Huse et al., Science, 246:1275, 1989) to allow rapid and easy
identification of monoclonal Fab fragments with the desired specificity.

Polyclonal and monoclonal antibodies that specifically bind to GEP
polypeptides or homologs or orthologs can be used, for example, to detect
expression of a GEP gene or homolog or ortholog in another strain of bacteria.
For example, a GEP polypeptide can be readily detected in conventional
immunoassays of bacterial cells or extracts. Examples of suitable assays include,
without limitation, Western blotting, ELISAs, radioimmune assays, and the like.

Assay for Antibacterial Agents

The invention provides a method for identifying an antibacterial agent(s). 
Although the inventors are not bound by any particular theory as to the biological 
mechanism involved, the new antibacterial agents are thought to inhibit specifically 
(1) the function of a polypeptide(s) encoded by a nucleic acid located within an
operon containing a GEP gene, or (2) expression of a gene located within an
operon containing a GEP gene, or homologs or orthologs thereof. Screening for
antibacterial agents can be rapidly accomplished by identifying those compounds 
(e.g., polypeptides or small molecules) that specifically bind to a polypeptide 
encoded by a nucleic acid located within an operon containing a GEP gene. A
homolog or ortholog of a GEP polypeptide can be substituted for the GEP
polypeptide in the methods summarized herein. Specific binding of a test
compound to a polypeptide can be detected, for example, in vitro by reversibly or
irreversibly immobilizing the test compound(s) on a substrate, e.g., the surface of a
well of a 96-well polystyrene microtitre plate. Methods for immobilizing
polypeptides and other small molecules are well known in the art. For example,
the microtitre plates can be coated with a polypeptide encoded by a nucleic acid
located within an operon containing a GEP gene (e.g., a GEP polypeptide or a
combination of GEP polypeptides and/or homologs and/or orthologs) by adding the
polypeptide(s) in a solution (typically, at a concentration of 0.05 to 1 mg/ml in a
volume of 1-100 µl) to each well, and incubating the plates at room temperature to
37°C for 0.1 to 36 hours. Polypeptides that are not bound to the plate can be
removed by shaking the excess solution from the plate, and then washing the plate
(once or repeatedly) with water or a buffer. Typically, the polypeptide, homolog,
or ortholog is contained in water or a buffer. The plate is then washed with a
buffer that lacks the bound polypeptide. To block the free protein-binding sites on
the plates, the plates are blocked with a protein that is unrelated to the bound
polypeptide. For example, 300 µl of bovine serum albumin (BSA) at a
concentration of 2 mg/ml in Tris-HCl is suitable. Suitable substrates include those
substrates that contain a defined cross-linking chemistry (e.g., plastic substrates,
such as polystyrene, styrene, or polypropylene substrates from Corning Costar
Corp. (Cambridge, MA), for example). If desired, a beaded particle, e.g., beaded
agarose or beaded sepharose, can be used as the substrate.
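
The coating and blocking amounts described above reduce to simple unit arithmetic, since a concentration in mg/ml multiplied by a volume in microliters gives a mass in micrograms. The following is a minimal sketch, assuming that the per-well volumes are expressed in microliters; the function name `mass_per_well_ug` is hypothetical and is used only for illustration.

```python
def mass_per_well_ug(conc_mg_per_ml: float, volume_ul: float) -> float:
    """Mass of protein delivered to one well, in micrograms.

    1 mg/ml equals 1 ug/ul, so mg/ml multiplied by ul gives ug.
    """
    if conc_mg_per_ml < 0 or volume_ul < 0:
        raise ValueError("concentration and volume must be non-negative")
    return conc_mg_per_ml * volume_ul


# Coating at the low end of the stated concentration range (0.05 mg/ml),
# using the largest stated volume (assumed here to be 100 ul).
coating = mass_per_well_ug(0.05, 100)

# Blocking with BSA at 2 mg/ml (volume assumed here to be 300 ul).
blocking = mass_per_well_ug(2, 300)

print(f"coating: {coating:.1f} ug per well; blocking: {blocking:.0f} ug per well")
```

A calculation of this kind makes it easy to confirm that the blocking protein is present in large excess over the coated polypeptide.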

Binding of the test compound to the new polypeptides (or homologs or
orthologs thereof) can be detected by any of a variety of art-known methods. For
example, an antibody that specifically binds to a GEP polypeptide can be used in
an immunoassay. If desired, the antibody can be labeled (e.g., fluorescently or
with a radioisotope) and detected directly (see, e.g., West and McMahon, J. Cell
Biol. 74:264, 1977). Alternatively, a second antibody can be used for detection
(e.g., a labeled antibody that binds to the Fc portion of an anti-GEP103 antibody).
In an alternative detection method, the GEP polypeptide is labeled, and the label is
detected (e.g., by labeling a GEP polypeptide with a radioisotope, fluorophore,
chromophore, or the like). In still another method, the GEP polypeptide is
produced as a fusion protein with a protein that can be detected optically, e.g.,
green fluorescent protein (which can be detected under UV light). In an alternative
method, the polypeptide (e.g., gep103) can be produced as a fusion protein with an
enzyme having a detectable enzymatic activity, such as horseradish peroxidase,
alkaline phosphatase, β-galactosidase, or glucose oxidase. Genes encoding all of
these enzymes have been cloned and are readily available for use by those of skill
in the art. If desired, the fusion protein can include an antigen, and such an
antigen can be detected and measured with a polyclonal or monoclonal antibody
using conventional methods. Suitable antigens include enzymes (e.g., horseradish
peroxidase, alkaline phosphatase, and β-galactosidase) and non-enzymatic
polypeptides (e.g., serum proteins, such as BSA and globulins, and milk proteins,
such as caseins).

In various in vivo methods for identifying polypeptides that bind to GEP 
polypeptides, the conventional two-hybrid assays of protein/protein interactions can 
be used (see, e.g., Chien et al., Proc. Natl. Acad. Sci. USA, 88:9578, 1991; Fields et
al., U.S. Pat. No. 5,283,173; Fields and Song, Nature, 340:245, 1989; Le Douarin
et al., Nucleic Acids Research, 23:876, 1995; Vidal et al., Proc. Natl. Acad. Sci.
USA, 93:10315-10320, 1996; and White, Proc. Natl. Acad. Sci. USA, 93:10001-10003,
1996). Kits for practicing various two-hybrid methods are commercially
available (e.g., from Clontech; Palo Alto, CA).

Generally, the two-hybrid methods involve in vivo reconstitution of two
separable domains of a transcription factor. The DNA binding domain (DB) of the
transcription factor is required for recognition of a chosen promoter. The
activation domain (AD) is required for contacting other components of the host
cell's transcriptional machinery. The transcription factor is reconstituted through
the use of hybrid proteins. One hybrid is composed of the AD and a first protein
of interest. The second hybrid is composed of the DB and a second protein of
interest.

Useful reporter genes are those that are operably linked to a promoter which 
is specifically recognized by the DB. Typically, the two-hybrid system employs 
the yeast Saccharomyces cerevisiae and reporter genes, the expression of which can
be selected under appropriate conditions. Other eukaryotic cells, including
mammalian and insect cells, can be used, if desired. The two-hybrid system
provides a convenient method for cloning a gene encoding a polypeptide (i.e., a
candidate antibacterial agent) that binds to a second, preselected polypeptide (e.g.,
gep103). Typically, though not necessarily, a DNA library is constructed such that
randomly generated sequences are fused to the AD, and the protein of interest (e.g.,
gep103) is fused to the DB.

In such two-hybrid methods, two fusion proteins are produced. One fusion 
protein contains the GEP polypeptide (or homolog or ortholog thereof) fused to 
either a transactivator domain or DNA binding domain of a transcription factor
(e.g., of Gal4). The other fusion protein contains a test polypeptide fused to either
the DNA binding domain or a transactivator domain of a transcription factor. Once
brought together in a single cell (e.g., a yeast cell or mammalian cell), one of the
fusion proteins contains the transactivator domain and the other fusion protein
contains the DNA binding domain. Therefore, binding of the GEP polypeptide to
the test polypeptide (i.e., candidate antibacterial agent) reconstitutes the
transcription factor. Reconstitution of the transcription factor can be detected by
detecting expression of a gene (i.e., a reporter gene) that is operably linked to a
DNA sequence that is bound by the DNA binding domain of the transcription
factor.
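
The logic of the two-hybrid readout can be summarized as a simple rule: the reporter gene is expressed only when one fusion carries the DNA binding domain, the other carries the activation domain, and the two fused proteins bind one another. The following is a minimal sketch of that rule only, not a model of the biology; the class `Fusion`, the function `reporter_expressed`, and the names `candidate_1` and `candidate_2` are hypothetical, and the interaction set is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    """A hybrid protein: a protein of interest fused to one domain."""
    protein: str
    domain: str  # "DB" (DNA binding domain) or "AD" (activation domain)


def reporter_expressed(first: Fusion, second: Fusion, interactions: set) -> bool:
    """Return True if the two fusions would reconstitute the transcription factor."""
    # One fusion must supply the DB and the other must supply the AD.
    if {first.domain, second.domain} != {"DB", "AD"}:
        return False
    # The fused proteins must bind one another to bring the domains together.
    return frozenset((first.protein, second.protein)) in interactions


# Hypothetical interaction data, for illustration only.
known = {frozenset(("gep103", "candidate_1"))}
bait = Fusion("gep103", "DB")

print(reporter_expressed(bait, Fusion("candidate_1", "AD"), known))
print(reporter_expressed(bait, Fusion("candidate_2", "AD"), known))
```

In a library screen, the same rule is applied to every test polypeptide, and only cells in which the rule is satisfied survive selection or produce the reporter signal.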

The methods described above can be used for high throughput screening of 
numerous test compounds to identify candidate antibacterial (or anti-bacterial) 
agents. Having identified a test compound as a candidate antibacterial agent, the 
candidate antibacterial agent can be further tested for inhibition of bacterial growth 
in vitro or in vivo (e.g., using an animal, e.g., rodent, model system) if desired.
Using other, art-known variations of such methods, one can test the ability of a
nucleic acid (e.g., DNA or RNA) used as the test compound to bind to a 
polypeptide encoded by a nucleic acid sequence located within an operon 
containing a GEP gene or homolog or ortholog thereof. 

In vitro, further testing can be accomplished by means known to those in
the art such as an enzyme inhibition assay or a whole-cell bacterial growth
inhibition assay. For example, an agar dilution assay identifies a substance that
inhibits bacterial growth. Microtiter plates are prepared with serial dilutions of the
test compound, a given amount of growth substrate is added to the preparation, and
a preparation of Streptococcus cells is provided. Inhibition of growth is determined,
for example, by observing changes in optical densities of the bacterial cultures.
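
The concentrations present in a serial dilution series can be laid out before the plates are prepared. The following is a minimal sketch, assuming a two-fold dilution series and a hypothetical starting concentration; neither the dilution factor nor the concentrations are specified in this description, and the function name `serial_dilution` is used only for illustration.

```python
def serial_dilution(start_conc: float, n_wells: int, factor: float = 2.0) -> list:
    """Return the test compound concentration in each successive well.

    Each well holds the previous well's concentration divided by `factor`.
    """
    if start_conc <= 0:
        raise ValueError("starting concentration must be positive")
    if n_wells < 1:
        raise ValueError("at least one well is required")
    if factor <= 1:
        raise ValueError("dilution factor must be greater than 1")
    return [start_conc / factor ** i for i in range(n_wells)]


# Hypothetical series: 8 wells starting at 64 (arbitrary concentration units).
for well, conc in enumerate(serial_dilution(64.0, 8), start=1):
    print(f"well {well}: {conc:g}")
```

The lowest concentration in such a series at which growth is inhibited gives a convenient measure of the potency of the test compound.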

Inhibition of bacterial growth is demonstrated, for example, by comparing 
(in the presence and absence of a test compound) the rate of growth or the absolute 
growth of bacterial cells. Inhibition includes a reduction of one of the above
measurements by at least 20% (e.g., at least 25%, 30%, 40%, 50%, 75%, 80%, or
90%).
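
The comparison described above can be expressed as a percent reduction relative to the untreated culture. The following is a minimal sketch, assuming growth is summarized as a single number per culture (for example, an optical density reading); the function names `percent_inhibition` and `is_inhibited` and the example readings are hypothetical.

```python
def percent_inhibition(growth_without: float, growth_with: float) -> float:
    """Percent reduction in growth caused by the test compound."""
    if growth_without <= 0:
        raise ValueError("growth of the untreated culture must be positive")
    return 100.0 * (growth_without - growth_with) / growth_without


def is_inhibited(growth_without: float, growth_with: float,
                 threshold_pct: float = 20.0) -> bool:
    """Apply the 'at least 20%' criterion stated in the text (adjustable)."""
    return percent_inhibition(growth_without, growth_with) >= threshold_pct


# Hypothetical optical density readings for untreated and treated cultures.
untreated, treated = 0.80, 0.40
print(f"{percent_inhibition(untreated, treated):.0f}% reduction")
print(is_inhibited(untreated, treated))
```

The threshold is a parameter so that the more stringent values listed above (e.g., 50% or 90%) can be applied to the same measurements.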

Rodent (e.g., murine) and rabbit animal models of streptococcal infections 
are known to those of skill in the art, and such animal model systems are accepted 
for screening antibacterial agents as an indication of their therapeutic efficacy in
human patients. In a typical in vivo assay, an animal is infected with a pathogenic
Streptococcus strain, e.g., by inhalation of Streptococcus pneumoniae, and
conventional methods and criteria are used to diagnose the mammal as being
afflicted with streptococcal pneumonia. The candidate antibacterial agent then is
administered to the mammal at a dosage of 1-100 mg/kg of body weight, and the
mammal is monitored for signs of amelioration of disease. Alternatively, the test
compound can be administered to the mammal prior to infecting the mammal with
Streptococcus, and the ability of the treated mammal to resist infection is measured.
Of course, the results obtained in the presence of the test compound should be
compared with results in control animals, which are not treated with the test
compound. Administration of candidate antibacterial agent to the mammal can be
carried out as described below, for example.

Pharmaceutical Formulations 

Treatment includes administering a pharmaceutically effective amount of a
composition containing an antibacterial agent to a subject in need of such
treatment, thereby inhibiting bacterial growth in the subject. Such a composition
typically contains from about 0.1 to 90% by weight (such as 1 to 20% or 1 to
10%) of an antibacterial agent of the invention in a pharmaceutically acceptable
carrier.

Solid formulations of the compositions for oral administration may contain
suitable carriers or excipients, such as corn starch, gelatin, lactose, acacia, sucrose,
microcrystalline cellulose, kaolin, mannitol, dicalcium phosphate, calcium
carbonate, sodium chloride, or alginic acid. Disintegrators that can be used
include, without limitation, micro-crystalline cellulose, corn starch, sodium starch
glycolate and alginic acid. Tablet binders that may be used include acacia,
methylcellulose, sodium carboxymethylcellulose, polyvinylpyrrolidone (Povidone),
hydroxypropyl methylcellulose, sucrose, starch, and ethylcellulose. Lubricants that
may be used include magnesium stearates, stearic acid, silicone fluid, talc, waxes,
oils, and colloidal silica.

Liquid formulations of the compositions for oral administration prepared in
water or other aqueous vehicles may contain various suspending agents such as
methylcellulose, alginates, tragacanth, pectin, kelgin, carrageenan, acacia,
polyvinylpyrrolidone, and polyvinyl alcohol. The liquid formulations may also
include solutions, emulsions, syrups and elixirs containing, together with the active
compound(s), wetting agents, sweeteners, and coloring and flavoring agents.
Various liquid and powder formulations can be prepared by conventional methods
for inhalation into the lungs of the mammal to be treated.

Injectable formulations of the compositions may contain various carriers 
such as vegetable oils, dimethylacetamide, dimethylformamide, ethyl lactate, ethyl
carbonate, isopropyl myristate, ethanol, and polyols (glycerol, propylene glycol, liquid
polyethylene glycol, and the like). For intravenous injections, water soluble
versions of the compounds may be administered by the drip method, whereby a
pharmaceutical formulation containing the antibacterial agent and a physiologically
acceptable excipient is infused. Physiologically acceptable excipients may include,
for example, 5% dextrose, 0.9% saline, Ringer's solution or other suitable
excipients. For intramuscular preparations, a sterile formulation of a suitable soluble
salt form of the compounds can be dissolved and administered in a pharmaceutical
excipient such as Water-for-Injection, 0.9% saline, or 5% glucose solution. A
suitable insoluble form of the compound may be prepared and administered as a
suspension in an aqueous base or a pharmaceutically acceptable oil base, such as an
ester of a long chain fatty acid (e.g., ethyl oleate).

A topical semi-solid ointment formulation typically contains a concentration 
of the active ingredient from about 1 to 20%, e.g., 5 to 10% in a carrier such as a 
pharmaceutical cream base. Various formulations for topical use include drops,
tinctures, lotions, creams, solutions, and ointments containing the active ingredient 
and various supports and vehicles. 

The optimal percentage of the antibacterial agent in each pharmaceutical 
formulation varies according to the formulation itself and the therapeutic effect 
desired in the specific pathologies and correlated therapeutic regimens. Appropriate
dosages of the antibacterial agents can readily be determined by those of ordinary 
skill in the art of medicine by monitoring the mammal for signs of disease 
amelioration or inhibition, and increasing or decreasing the dosage and/or frequency 
of treatment as desired. The optimal amount of the antibacterial compound used 
for treatment of conditions caused by or contributed to by bacterial infection may
depend upon the manner of administration, the age and the body weight of the 
subject and the condition of the subject to be treated. Generally, the antibacterial 
compound is administered at a dosage of 1 to 100 mg/kg of body weight, and 
typically at a dosage of 1 to 10 mg/kg of body weight. 
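
The weight-based dosage stated above translates directly into an absolute amount per administration. The following is a minimal sketch, assuming a single administration and using only the 1 to 100 mg/kg range given in the text; the function name `dose_mg` and the example body weights are hypothetical, and the sketch is illustrative arithmetic rather than dosing guidance.

```python
def dose_mg(body_weight_kg: float, dose_mg_per_kg: float) -> float:
    """Absolute amount of antibacterial compound, in mg, for one administration."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    # The text gives a general range of 1 to 100 mg/kg of body weight.
    if not 1 <= dose_mg_per_kg <= 100:
        raise ValueError("dosage is outside the 1 to 100 mg/kg range in the text")
    return body_weight_kg * dose_mg_per_kg


# Hypothetical subjects: a 70 kg adult and a 0.025 kg (25 g) mouse.
for label, weight in (("adult", 70.0), ("mouse", 0.025)):
    low, high = dose_mg(weight, 1), dose_mg(weight, 10)
    print(f"{label}: {low:g} to {high:g} mg at the typical 1 to 10 mg/kg dosage")
```

The range check is deliberately strict so that a value entered in the wrong units is rejected rather than silently converted.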

Using the transposon-based mutagenesis methods described above, the
Streptococcus pneumoniae genome was mutagenized, and 23 genes were identified
as being located within operons that are essential for survival of Streptococcus
pneumoniae. These genes are listed in Table 1, above, and their nucleic acid and
amino acid sequences are represented by SEQ ID NOs:1-69, as shown in Figs. 1-23.

Now that each of these genes is known to be located within an operon that 
is essential for survival of Streptococcus, the polypeptides encoded by nucleic acids 
located within those operons can be used to identify antibacterial agents by using
the assays described herein. Other art-known assays to detect interactions of test
compounds with proteins, or to detect inhibition of bacterial growth also can be
used with the nucleic acids located within operons containing the GEP genes, and 
gene products and homologs or orthologs thereof. 

Other Embodiments

The invention also features fragments, variants, analogs, and derivatives of 
the GEP polypeptides described above that retain one or more of the biological 
activities of the GEP polypeptides, e.g., as determined in a complementation assay. 
Also included within the invention are naturally-occurring and
non-naturally-occurring allelic variants. Compared with the naturally-occurring GEP gene
sequences depicted in Figs. 1-23, the nucleic acid sequence encoding allelic variants
may have a substitution, deletion, or addition of one or more nucleotides. The 
preferred allelic variants are functionally equivalent to a GEP polypeptide, e.g., as 
determined in a complementation assay. 

It is to be understood that, while the invention has been described in
conjunction with the detailed description thereof, the foregoing description is
intended to illustrate and not limit the scope of the invention, which is defined by 
the scope of the appended claims. Other aspects, advantages, and modifications are 
within the scope of the following claims. 
What is claimed is:

1. An isolated operon comprising a nucleotide sequence, or an allelic
variant or homolog of the nucleotide sequence, encoding:

a gep103 polypeptide comprising the amino acid sequence of SEQ ID NO:1,
as depicted in Fig. 1;

a gep1119 polypeptide comprising the amino acid sequence of SEQ ID
NO:4, as depicted in Fig. 2;

a gep1122 polypeptide comprising the amino acid sequence of SEQ ID
NO:7, as depicted in Fig. 3;

a gep1315 polypeptide comprising the amino acid sequence of SEQ ID
NO:10, as depicted in Fig. 4;

a gep1493 polypeptide comprising the amino acid sequence of SEQ ID
NO:13, as depicted in Fig. 5;

a gep1507 polypeptide comprising the amino acid sequence of SEQ ID
NO:16, as depicted in Fig. 6;

a gep1511 polypeptide comprising the amino acid sequence of SEQ ID
NO:19, as depicted in Fig. 7;

a gep1518 polypeptide comprising the amino acid sequence of SEQ ID
NO:22, as depicted in Fig. 8;

a gep1546 polypeptide comprising the amino acid sequence of SEQ ID
NO:25, as depicted in Fig. 9;

a gep1551 polypeptide comprising the amino acid sequence of SEQ ID
NO:28, as depicted in Fig. 10;

a gep1561 polypeptide comprising the amino acid sequence of SEQ ID
NO:31, as depicted in Fig. 11;

a gep1580 polypeptide comprising the amino acid sequence of SEQ ID
NO:34, as depicted in Fig. 12;

a gep1713 polypeptide comprising the amino acid sequence of SEQ ID
NO:37, as depicted in Fig. 13;
a gep222 polypeptide comprising the amino acid sequence of SEQ ID
NO:40, as depicted in Fig. 14;

a gep2283 polypeptide comprising the amino acid sequence of SEQ ID
NO:43, as depicted in Fig. 15;

a gep273 polypeptide comprising the amino acid sequence of SEQ ID
NO:46, as depicted in Fig. 16;

a gep286 polypeptide comprising the amino acid sequence of SEQ ID
NO:49, as depicted in Fig. 17;

a gep311 polypeptide comprising the amino acid sequence of SEQ ID
NO:52, as depicted in Fig. 18;

a gep3262 polypeptide comprising the amino acid sequence of SEQ ID
NO:55, as depicted in Fig. 19;

a gep3387 polypeptide comprising the amino acid sequence of SEQ ID
NO:58, as depicted in Fig. 20;

a gep47 polypeptide comprising the amino acid sequence of SEQ ID NO:61,
as depicted in Fig. 21;

a gep61 polypeptide comprising the amino acid sequence of SEQ ID NO:64,
as depicted in Fig. 22; or

a gep76 polypeptide comprising the amino acid sequence of SEQ ID NO:67,
as depicted in Fig. 23.

2. An isolated nucleic acid molecule comprising a nucleic acid sequence 
selected from the group consisting of: 

(1) an operon comprising the sequence of SEQ ID NO:2, as depicted in Fig.
1, or degenerate variants thereof;

(2) an operon comprising the sequence of SEQ ID NO:2, or degenerate
variants thereof, wherein T is replaced by U;

(3) nucleic acids complementary to (1) and (2); 
(4) fragments of (1), (2), and (3) that are at least 15 base pairs in length and
which hybridize under stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO:1;

(5) an operon comprising the sequence of SEQ ID NO:5, as depicted in Fig.
2, or degenerate variants thereof;

(6) an operon comprising the sequence of SEQ ID NO:5, or degenerate
variants thereof, wherein T is replaced by U;

(7) nucleic acids complementary to (5) and (6);

(8) fragments of (5), (6), and (7) that are at least 15 base pairs in length and
which hybridize under stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO:4;

(9) an operon comprising the sequence of SEQ ID NO:8, as depicted in Fig.
3, or degenerate variants thereof;

(10) an operon comprising the sequence of SEQ ID NO:8, or degenerate
variants thereof, wherein T is replaced by U;

(11) nucleic acids complementary to (9) and (10);

(12) fragments of (9), (10), and (11) that are at least 15 base pairs in length
and which hybridize under stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO:7;

(13) an operon comprising the sequence of SEQ ID NO:11, as depicted in
Fig. 4, or degenerate variants thereof;

(14) an operon comprising the sequence of SEQ ID NO:11, or degenerate
variants thereof, wherein T is replaced by U;

(15) nucleic acids complementary to (13) and (14);

(16) fragments of (13), (14), and (15) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:10;
(17) an operon comprising the sequence of SEQ ID NO:14, as depicted in
Fig. 5, or degenerate variants thereof;

(18) an operon comprising the sequence of SEQ ID NO:14, or degenerate
variants thereof, wherein T is replaced by U;

(19) nucleic acids complementary to (17) and (18);

(20) fragments of (17), (18), and (19) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:13;

(21) an operon comprising the sequence of SEQ ID NO:17, as depicted in
Fig. 6, or degenerate variants thereof;

(22) an operon comprising the sequence of SEQ ID NO:17, or degenerate
variants thereof, wherein T is replaced by U;

(23) nucleic acids complementary to (21) and (22);

(24) fragments of (21), (22), and (23) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:16;

(25) an operon comprising the sequence of SEQ ID NO:20, as depicted in
Fig. 7, or degenerate variants thereof;

(26) an operon comprising the sequence of SEQ ID NO:20, or degenerate
variants thereof, wherein T is replaced by U;

(27) nucleic acids complementary to (25) and (26);

(28) fragments of (25), (26), and (27) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:19;

(29) an operon comprising the sequence of SEQ ID NO:23, as depicted in
Fig. 8, or degenerate variants thereof;



WO 99/33871 PCT/US98/27918

(30) an operon comprising the sequence of SEQ ID NO:23, or degenerate
variants thereof, wherein T is replaced by U;

(31) nucleic acids complementary to (29) and (30);

(32) fragments of (29), (30), and (31) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:22;

(33) an operon comprising the sequence of SEQ ID NO:26, as depicted in
Fig. 9, or degenerate variants thereof;

(34) an operon comprising the sequence of SEQ ID NO:26, or degenerate
variants thereof, wherein T is replaced by U;

(35) nucleic acids complementary to (33) and (34);

(36) fragments of (33), (34), and (35) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:25;

(37) an operon comprising the sequence of SEQ ID NO:29, as depicted in
Fig. 10, or degenerate variants thereof;

(38) an operon comprising the sequence of SEQ ID NO:29, or degenerate
variants thereof, wherein T is replaced by U;

(39) nucleic acids complementary to (37) and (38);

(40) fragments of (37), (38), and (39) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:28;

(41) an operon comprising the sequence of SEQ ID NO:32, as depicted in
Fig. 11, or degenerate variants thereof;

(42) an operon comprising the sequence of SEQ ID NO:32, or degenerate
variants thereof, wherein T is replaced by U;

(43) nucleic acids complementary to (41) and (42);
(44) fragments of (41), (42), and (43) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:31;

(45) an operon comprising the sequence of SEQ ID NO:35, as depicted in
Fig. 12, or degenerate variants thereof;

(46) an operon comprising the sequence of SEQ ID NO:35, or degenerate
variants thereof, wherein T is replaced by U;

(47) nucleic acids complementary to (45) and (46);

(48) fragments of (45), (46), and (47) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:34;

(49) an operon comprising the sequence of SEQ ID NO:38, as depicted in
Fig. 13, or degenerate variants thereof;

(50) an operon comprising the sequence of SEQ ID NO:38, or degenerate
variants thereof, wherein T is replaced by U;

(51) nucleic acids complementary to (49) and (50);

(52) fragments of (49), (50), and (51) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:37;

(53) an operon comprising the sequence of SEQ ID NO:41, as depicted in
Fig. 14, or degenerate variants thereof;

(54) an operon comprising the sequence of SEQ ID NO:41, or degenerate
variants thereof, wherein T is replaced by U;

(55) nucleic acids complementary to (53) and (54);

(56) fragments of (53), (54), and (55) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:40;
(57) an operon comprising the sequence of SEQ ID NO:44, as depicted in
Fig. 15, or degenerate variants thereof;

(58) an operon comprising the sequence of SEQ ID NO:44, or degenerate
variants thereof, wherein T is replaced by U;

(59) nucleic acids complementary to (57) and (58);

(60) fragments of (57), (58), and (59) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:43;

(61) an operon comprising the sequence of SEQ ID NO:47, as depicted in
Fig. 16, or degenerate variants thereof;

(62) an operon comprising the sequence of SEQ ID NO:47, or degenerate
variants thereof, wherein T is replaced by U;

(63) nucleic acids complementary to (61) and (62);

(64) fragments of (61), (62), and (63) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:46;

(65) an operon comprising the sequence of SEQ ID NO:50, as depicted in
Fig. 17, or degenerate variants thereof;

(66) an operon comprising the sequence of SEQ ID NO:50, or degenerate
variants thereof, wherein T is replaced by U;

(67) nucleic acids complementary to (65) and (66);

(68) fragments of (65), (66), and (67) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:49;

(69) an operon comprising the sequence of SEQ ID NO:53, as depicted in
Fig. 18, or degenerate variants thereof;
(70) an operon comprising the sequence of SEQ ID NO:53, or degenerate
variants thereof, wherein T is replaced by U;

(71) nucleic acids complementary to (69) and (70);

(72) fragments of (69), (70), and (71) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:52;

(73) an operon comprising the sequence of SEQ ID NO:56, as depicted in
Fig. 19, or degenerate variants thereof;

(74) an operon comprising the sequence of SEQ ID NO:56, or degenerate
variants thereof, wherein T is replaced by U;

(75) nucleic acids complementary to (73) and (74);

(76) fragments of (73), (74), and (75) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:55;

(77) an operon comprising the sequence of SEQ ID NO:59, as depicted in
Fig. 20, or degenerate variants thereof;

(78) an operon comprising the sequence of SEQ ID NO:59, or degenerate
variants thereof, wherein T is replaced by U;

(79) nucleic acids complementary to (77) and (78);

(80) fragments of (77), (78), and (79) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:58;

(81) an operon comprising the sequence of SEQ ID NO:62, as depicted in
Fig. 21, or degenerate variants thereof;

(82) an operon comprising the sequence of SEQ ID NO:62, or degenerate
variants thereof, wherein T is replaced by U;

(83) nucleic acids complementary to (81) and (82);
(84) fragments of (81), (82), and (83) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:61;

(85) an operon comprising the sequence of SEQ ID NO:65, as depicted in
Fig. 22, or degenerate variants thereof;

(86) an operon comprising the sequence of SEQ ID NO:65, or degenerate
variants thereof, wherein T is replaced by U;

(87) nucleic acids complementary to (85) and (86);

(88) fragments of (85), (86), and (87) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:64;

(89) an operon comprising the sequence of SEQ ID NO:68, as depicted in
Fig. 23, or degenerate variants thereof;

(90) an operon comprising the sequence of SEQ ID NO:68, or degenerate
variants thereof, wherein T is replaced by U;

(91) nucleic acids complementary to (89) and (90); and

(92) fragments of (89), (90), and (91) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:67.
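
The sequence relationships recited in claim 2 (a T-to-U variant, a complementary nucleic acid, and a fragment of at least 15 base pairs) can be illustrated with a short Python sketch. All function names and the toy sequence are hypothetical and are not taken from the sequence listing. The sketch checks only length and sequence containment; hybridization under stringent conditions is an experimental property that it does not model.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def to_rna(dna: str) -> str:
    """Return the sequence with every T replaced by U."""
    return dna.replace("T", "U").replace("t", "u")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def is_candidate_fragment(fragment: str, parent: str, min_len: int = 15) -> bool:
    """True if `fragment` is at least `min_len` bases long and occurs in
    `parent` or in the strand complementary to `parent`."""
    if len(fragment) < min_len:
        return False
    return fragment in parent or fragment in reverse_complement(parent)


if __name__ == "__main__":
    toy = "ATGGCTAGCTAGGATCCGATCGTA"  # hypothetical 24-base sequence
    print(to_rna(toy))
    print(reverse_complement(toy))
    print(is_candidate_fragment(toy[2:20], toy))
```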

3. An isolated operon from Streptococcus comprising a nucleotide sequence
that is at least 85% identical to a nucleotide sequence selected from the group
consisting of

SEQ ID NO:2;
SEQ ID NO:5;
SEQ ID NO:8;
SEQ ID NO:11;
SEQ ID NO:14;
SEQ ID NO:17;
SEQ ID NO:20;
SEQ ID NO:23;
SEQ ID NO:26;
SEQ ID NO:29;
SEQ ID NO:32;
SEQ ID NO:35;
SEQ ID NO:38;
SEQ ID NO:41;
SEQ ID NO:44;
SEQ ID NO:47;
SEQ ID NO:50;
SEQ ID NO:53;
SEQ ID NO:56;
SEQ ID NO:59;
SEQ ID NO:62;
SEQ ID NO:65; and
SEQ ID NO:68.
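
As an illustration of the 85% identity threshold in claim 3, the following Python sketch computes percent identity for two sequences that have already been aligned to the same length. This is a simplifying assumption: the claim does not specify the alignment method, and real comparisons would normally rely on a sequence alignment program. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions at which both sequences carry the
    same base. Gap characters ('-') never count as matches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 85.0) -> bool:
    """True if the two aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "ACGT" * 5            # hypothetical 20-base reference
    variant = "ACGT" * 4 + "ACAA"     # differs at 2 of 20 positions
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant))
```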

4. An isolated nucleic acid molecule that is at least 15 base pairs in length
and hybridizes under stringent conditions to a nucleotide sequence selected from
the group consisting of

SEQ ID NO:2;
SEQ ID NO:5;
SEQ ID NO:8;
SEQ ID NO:11;
SEQ ID NO:14;
SEQ ID NO:17;
SEQ ID NO:20;
SEQ ID NO:23;
SEQ ID NO:26;
SEQ ID NO:29;
SEQ ID NO:32;
SEQ ID NO:35;
SEQ ID NO:38;
SEQ ID NO:41;
SEQ ID NO:44;
SEQ ID NO:47;
SEQ ID NO:50;
SEQ ID NO:53;
SEQ ID NO:56;
SEQ ID NO:59;
SEQ ID NO:62;
SEQ ID NO:65; and
SEQ ID NO:68.



5. A vector comprising an operon of claim 1. 

6. A vector comprising a nucleic acid molecule of claim 2. 

7. An expression vector comprising an operon of claim 1 operably linked 
to a nucleotide sequence regulatory element that controls expression of said operon. 



8. An expression vector comprising a nucleic acid molecule of claim 2, 
wherein said nucleic acid molecule is operably linked to a nucleotide sequence 
regulatory element that controls expression of said nucleic acid. 



9. A host cell comprising an exogenously introduced operon of claim 1. 
10. A host cell comprising an exogenously introduced nucleic acid
molecule of claim 2.

11. A host cell of claim 9, wherein the cell is a yeast or bacterium.

12. A host cell of claim 10, wherein the cell is a yeast or bacterium.

13. A genetically engineered host cell comprising an operon of claim 1
operably linked to a heterologous nucleotide sequence regulatory element that
controls expression of the operon in the host cell.

14. A host cell of claim 13, wherein the cell is a yeast or bacterium.

15. A genetically engineered host cell comprising a nucleic acid molecule
of claim 2 operably linked to a nucleotide sequence regulatory element that controls
expression of the nucleic acid in the host cell.

16. A host cell of claim 15, wherein the cell is a yeast or bacterium.

17. An isolated operon comprising a nucleotide sequence encoding a
polypeptide comprising an amino acid sequence selected from the group consisting
of:

the amino acid sequence of SEQ ID NO:1, as depicted in Fig. 1;
the amino acid sequence of SEQ ID NO:4, as depicted in Fig. 2;
the amino acid sequence of SEQ ID NO:7, as depicted in Fig. 3;
the amino acid sequence of SEQ ID NO:10, as depicted in Fig. 4;
the amino acid sequence of SEQ ID NO:13, as depicted in Fig. 5;
the amino acid sequence of SEQ ID NO:16, as depicted in Fig. 6;
the amino acid sequence of SEQ ID NO:19, as depicted in Fig. 7;
the amino acid sequence of SEQ ID NO:22, as depicted in Fig. 8;
the amino acid sequence of SEQ ID NO:25, as depicted in Fig. 9;
the amino acid sequence of SEQ ID NO:28, as depicted in Fig. 10;
the amino acid sequence of SEQ ID NO:31, as depicted in Fig. 11;
the amino acid sequence of SEQ ID NO:34, as depicted in Fig. 12;
the amino acid sequence of SEQ ID NO:37, as depicted in Fig. 13;
the amino acid sequence of SEQ ID NO:40, as depicted in Fig. 14;
the amino acid sequence of SEQ ID NO:43, as depicted in Fig. 15;
the amino acid sequence of SEQ ID NO:46, as depicted in Fig. 16;
the amino acid sequence of SEQ ID NO:49, as depicted in Fig. 17;
the amino acid sequence of SEQ ID NO:52, as depicted in Fig. 18;
the amino acid sequence of SEQ ID NO:55, as depicted in Fig. 19;
the amino acid sequence of SEQ ID NO:58, as depicted in Fig. 20;
the amino acid sequence of SEQ ID NO:61, as depicted in Fig. 21;
the amino acid sequence of SEQ ID NO:64, as depicted in Fig. 22;
the amino acid sequence of SEQ ID NO:67, as depicted in Fig. 23.



18. An isolated polypeptide encoded by a nucleic acid located within an 
operon comprising a nucleic acid sequence selected from the group consisting of 
SEQ ID NO: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35, 38, 41, 44, 47, 50, 53, 56, 
59, 62, 65, and 68. 

19. An isolated polypeptide, said polypeptide being encoded by an operon
of claim 1.

20. An isolated polypeptide, said polypeptide being encoded by a nucleic
acid molecule of claim 2.

21. An isolated polypeptide, said polypeptide being encoded by an
operon of claim 3.
22. A method for identifying an antibacterial agent, the method comprising:

(a) contacting a test compound with a polypeptide, or a homolog of a
polypeptide, encoded by a nucleic acid sequence located within an operon
comprising a GEP gene selected from the group consisting of gep103, gep1119,
gep1122, gep1315, gep1493, gep1507, gep1511, gep1518, gep1546, gep1551,
gep1561, gep1580, gep1713, gep222, gep2283, gep273, gep286, gep311, gep3262,
gep3387, gep47, gep61, and gep76; and

(b) detecting binding of the test compound to the polypeptide, wherein
binding indicates that the test compound is an antibacterial agent.

23. The method of claim 22, further comprising:

(c) determining whether a test compound that binds to the polypeptide
inhibits growth of bacteria, relative to growth of bacteria cultured in the absence of
a test compound that binds to the polypeptide, wherein inhibition of growth
indicates that the test compound is an antibacterial agent.

24. The method of claim 22, wherein the polypeptide is selected from the
group consisting of gep103, gep1119, gep1122, gep1315, gep1493, gep1507,
gep1511, gep1518, gep1546, gep1551, gep1561, gep1580, gep1713, gep222,
gep2283, gep273, gep286, gep311, gep3262, gep3387, gep47, gep61, and gep76.

25. The method of claim 22, wherein the test compound is immobilized on
a substrate, and binding of the test compound to the polypeptide is detected as
immobilization of the polypeptide on the immobilized test compound.

26. The method of claim 25, wherein immobilization of the polypeptide on
the test compound is detected in an immunoassay with an antibody that specifically
binds to the polypeptide.




27. The method of claim 22, wherein the test compound is selected from 
the group consisting of polypeptides and small molecules. 

28. The method of claim 22, wherein: 

(a) the polypeptide is provided as a fusion protein comprising the
polypeptide fused to (i) a transcription activation domain of a transcription factor or
(ii) a DNA-binding domain of a transcription factor; and

(b) the test compound is a polypeptide that is provided as a fusion protein
comprising the test polypeptide fused to (i) a transcription activation domain of a
transcription factor or (ii) a DNA-binding domain of a transcription factor, to
interact with the fusion protein; and

(c) binding of the test compound to the polypeptide is detected as 
reconstitution of a transcription factor. 



29. An antibody that specifically binds to a GEP polypeptide of claim 19. 



30. An antibody of claim 29, wherein the antibody is a monoclonal
antibody.



31. A method for identifying an antibacterial agent, the method comprising: 

(a) contacting a polypeptide encoded by a nucleic acid located within an 
operon comprising a GEP gene with a test compound; 

(b) detecting a decrease in function of the polypeptide contacted with the
test compound; and

(c) determining whether a test compound that decreases function of a
contacted polypeptide inhibits growth of bacteria, relative to growth of bacteria
cultured in the absence of a test compound that decreases function of a contacted
polypeptide, wherein inhibition of growth indicates that the test compound is an
antibacterial agent.
32. The method of claim 31, wherein the polypeptide is selected from the
group consisting of gep103, gep1119, gep1122, gep1315, gep1493, gep1507,
gep1511, gep1518, gep1546, gep1551, gep1561, gep1580, gep1713, gep222,
gep2283, gep273, gep286, gep311, gep3262, gep3387, gep47, gep61, and gep76.

33. The method of claim 31, wherein the test compound is selected from
the group consisting of polypeptides and small molecules.

34. A method for identifying an antibacterial agent, the method comprising: 

(a) contacting a nucleic acid comprising an operon containing a gene
encoding a GEP polypeptide with a test compound, wherein the GEP polypeptide is
selected from the group consisting of gep103, gep1119, gep1122, gep1315,
gep1493, gep1507, gep1511, gep1518, gep1546, gep1551, gep1561, gep1580,
gep1713, gep222, gep2283, gep273, gep286, gep311, gep3262, gep3387, gep47,
gep61, and gep76; and

(b) detecting binding of the test compound to the nucleic acid, wherein
binding indicates that the test compound is an antibacterial agent.

35. The method of claim 34, further comprising: 

(c) determining whether a test compound that binds to the nucleic acid
inhibits growth of bacteria, relative to growth of bacteria cultured in the absence of
the test compound that binds to the nucleic acid, wherein inhibition of growth
indicates that the test compound is an antibacterial agent.

36. The method of claim 34, wherein the test compound is selected from 
the group consisting of polypeptides and small molecules. 

37. An isolated nucleic acid or an allelic variant thereof encoding: 

a gep1493 polypeptide comprising the amino acid sequence of SEQ ID
NO:13, as depicted in Fig. 5;

a gep1507 polypeptide comprising the amino acid sequence of SEQ ID
NO:16, as depicted in Fig. 6;

a gep1546 polypeptide comprising the amino acid sequence of SEQ ID
NO:25, as depicted in Fig. 9;

a gep273 polypeptide comprising the amino acid sequence of SEQ ID
NO:46, as depicted in Fig. 16;

a gep286 polypeptide comprising the amino acid sequence of SEQ ID
NO:49, as depicted in Fig. 17; or

a gep76 polypeptide comprising the amino acid sequence of SEQ ID NO:67,
as depicted in Fig. 23.

38. An isolated nucleic acid comprising a sequence selected from the group 
consisting of: 

(1) SEQ ID NO:14, as depicted in Fig. 5, or degenerate variants thereof;

(2) SEQ ID NO:14, or degenerate variants thereof, wherein T is replaced by
U;

(3) nucleic acids complementary to (1) and (2);

(4) fragments of (1), (2), and (3) that are at least 15 base pairs in length and
which hybridize under stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO:13;

(5) SEQ ID NO:17, as depicted in Fig. 6, or degenerate variants thereof;

(6) SEQ ID NO:17, or degenerate variants thereof, wherein T is replaced by
U;

(7) nucleic acids complementary to (5) and (6);

(8) fragments of (5), (6), and (7) that are at least 15 base pairs in length and
which hybridize under stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO:16;

(9) SEQ ID NO:26, as depicted in Fig. 9, or degenerate variants thereof;

(10) SEQ ID NO:26, or degenerate variants thereof, wherein T is replaced
by U;
(11) nucleic acids complementary to (9) and (10);

(12) fragments of (9), (10), and (11) that are at least 15 base pairs in length
and which hybridize under stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO:25;

(13) SEQ ID NO:47, as depicted in Fig. 16, or degenerate variants thereof;

(14) SEQ ID NO:47, or degenerate variants thereof, wherein T is replaced
by U;

(15) nucleic acids complementary to (13) and (14);

(16) fragments of (13), (14), and (15) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:46;

(17) SEQ ID NO:50, as depicted in Fig. 17, or degenerate variants thereof;

(18) SEQ ID NO:50, or degenerate variants thereof, wherein T is replaced
by U;

(19) nucleic acids complementary to (17) and (18);

(20) fragments of (17), (18), and (19) that are at least 15 base pairs in length
and which hybridize under stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO:49;

(21) SEQ ID NO:68, as depicted in Fig. 23, or degenerate variants thereof;

(22) SEQ ID NO:68, or degenerate variants thereof, wherein T is replaced
by U;

(23) nucleic acids complementary to (21) and (22); and

(24) fragments of (21), (22), and (23) that are at least 15 base pairs in
length and which hybridize under stringent conditions to genomic DNA encoding
the polypeptide of SEQ ID NO:67.

39. A method for identifying an antibacterial agent, the method comprising:

(a) contacting a test compound with a polypeptide, or a homolog of a
polypeptide, encoded by a nucleic acid sequence located within an operon
comprising a B-yneS gene; and

(b) detecting binding of the test compound to the polypeptide, wherein
binding indicates that the test compound is an antibacterial agent.

40. The method of claim 39, further comprising:

(c) determining whether a test compound that binds to the polypeptide
inhibits growth of bacteria, relative to growth of bacteria cultured in the absence of
a test compound that binds to the polypeptide, wherein inhibition of growth
indicates that the test compound is an antibacterial agent.
[Fig. 1: gep103 amino acid sequence (SEQ ID NO:1) shown with the nucleotide sequences of SEQ ID NO:2 and SEQ ID NO:3.]
[Fig. 2: gep1119 amino acid sequence (SEQ ID NO:4) shown with the nucleotide sequences of SEQ ID NO:5 and SEQ ID NO:6.]
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C<;F:0 TD NO* ft ^ ^ AACCCACCXCCTGCAACTTTTCCCrCXTXTrr^ • 

(SEQ ID NO: 9) 

1 0 1 TACTAGXTTTTGXAATCCnTTTTCAGCTXCrrrCTCACTC^ , . . 

»Te*TCTA*AACTTTMa»»AAXACTCCATCAAACACTCMnrCTCT^r :(,C l l^ 

20i CTCACCATCATTAATCCXCTTAT L . 1 L U 1 JUlCATTACaLfcT*ACrrCACAAACCA i 1111 ATCA*TATCCTATTTTTTC«UTATTCTCTCAC^C* laa 
GACTCCTAGTAATTAXXTPCAATACAACAAATTCTAATCCTTATTCAACTCTTT^ ' ° 

JOl T77TCAGTGC11 lOL , « i AAACCATAACrrCCTA<UTOCCACATTCrTACCATAACAAAA-iTCACC^ 1 1 H.L 1 i A *flo 

AAAACTCACGCACGAAATTTCCTATTCACCATCTCCCCCTCTAAGAATMTArrcrrrTAACrCC^ 

401 TCACCTTATCTCTCCATAACATAAAACCAACAATTCTATCT^ . 
ACTCCAATACACASCTArrCTA: i t Ibl lAACATACUACCCACTA TAT CGTAAACACCCCT AATACTTm^C^ TACT rTJLTrT<^a»^^ 

S31 TTCAACTTrrCTOATTTTCATACCTCTATTATAACTCAA^ .^.^ 
AAerrTCAAAACACTAAAACTATCGACATAATArrCACTTTTACACTArrCTATCCCCATACTTACACrrr^ 

(SEQ ID NO: 7) I m n l x v k o k i p i. k x k j* 

CCGTACCCTTAArrCCCACTCCCrrACCCCAAAATCCrTTTTTGTAATCACAAACATCCT 
ISBMCINCECICFYOXTLVPVPCALKCEDrYCOITS 48 

CATAATCTCCCTTCAAACAACTTCtrrrTTAATCAC7TCaiCT7CrTCTTCA(UTTTAAACCrTA^ 

J^^SW^VEAKLLKVKKKSKFHIVPSCTIYNECCG 01 

n5£5iii''"^5Ei'~=*'^*"^'^**^*^CCTCCACTTCAACACC<^ *0q 
CACCCTTTAGTACCTCGACGTAATACTATTCCTCGACCTCAACTTCTCCCTCAATCAACTACTTTOCCAC^^ 

^= Cfr''W-»1^0KO:.£rKTOLLHOAtKKrAPAGyEK lU 

ATACTTTAACCAOCTTGATAACCrrACCTCrTTGCTTTTATAATCTCTCCATTaUTCTTAAACTCTCACCT^ 
••-^ = *'*P*:°«OtPr.YVRAKl.0rOTI»KrKK0VKAGL 1«» 

i 0 : : ;"ATCCACAAAACTCT»CTATTTA^ 

A . A . ACCTSTTTTCACACTSATAAATCATCTCAArrrrCTGACCCACCATCTTCTATTCCrTTCGCT ^ "° 

^AOMSHYLVELKOCLVODKETOVIAHRLAELLT 111 

* *' " ni;2J£iSJIISiJI?JSSTSi5iS^'^~*°*^^^ 1200 

AATACTCCTCTAACCTTAGTGCCTACTCTCTTTTCAAGATCCACACCCATCATAATACCACGCTCCCCCCTCTTT 
* « 0 » P • • D « » K ••• L G V R T 1 H V B R A R K T c 0 V 0 1 I : 

' • °- ^i£i±^SE^°^^rTT^?Tg^^^<^^***<^CrrCCTTAAACATTTCCC^ 

.AATCrrTGCCCGTCGAATTAAATTCACTTAACCArrrTCTCAACCAATrrCTAAACCCTCTTCA^ 

::SVT»RCLKLTCLV|tELV(COrPCVVTVAVNTHTAKT 34S 
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249 SEiyCEKTSXIWCOESIOECVLWYErSLSPRAF 351 

1401 TTATCAACTAAATCCTCACOUaOUUACTCCTCTJlTACCCJUtS i-OO 
AATACTTGXTTTXCCA C 1 LLl 1 1 0 1 Li ! U CCACATATCC L 1 i LU i C»TTTTCCCCXCCTXC»ACTA i ' 11 L 1 !(. 1 UU i AXACrAACTCCgiUTAXCACCT 

313 YgLKPEOTEVLYStAVKAUDVOKIOHLXOATCC JM 

1501 grrCCAACGATTCGA i i 1L.LL1 1 lU CAAAGAA^CTAAAAACACTCACACCTATO^kTATrATTCCACAACCTATTGAAGATGCCAAC^^ ISOO 
CAACCTTCCTAACCTAAACCCAAA LU i U L . ; ! CA t * 1 H U ACTCrCCArACCTATAATAACCTCTTCCATAACTTCTAC GO i I Lw.L i 1 1 ACCAT7T7 

1XSVCTXCPAFAX1CVXTLSCMOZ I PEA XBDAXRNAKR 146 

l«01 CAATCCCATmiACAATACTCATTATCAACCTCXyuCCCCAGAACACU^TTXTTCL : LV* 1 I OCTACAACGAACGCTACCGACCACATCCriTCATTtnTCA noO 
CrrACCCTAAACTCTTATGACTAATACTTCCACCTTCCLi; 1 C T TCrCTAATAAGCACCAACCATCTTCL i 1 tLUATCCCTCCTCTACCAAACT AACAACT 

349 MCfOKTHyEACTAlETlPltWrKECYHAOACIVD J81 

1701 CCCACCACCTACACCTCTCCATCATAACTTATTACATACTATTCTTACrrATCTrACa^CAAAAAA 
CCCTCX:TCCATCrrCCACACCTACTATTaUkTAATCTATCATAACAATa^ATACATCCT 

183 PPRTCLDOKLLDTILTYVPEKMVYI SCMVSTLA 414 

1801 CCTCATrrCCTACCCTTACTACyuWTCTATCATCTrCXTTATATCCACTCGCTCCATAT^ 1,00 
CCACTAAACCATCCGAATOlTrrTCACATACTACAACTAATATAaJTCACCCACCTATACAACCCTCTATC^^ 

4l5RDLVRLVEVtDLHY10SVOMFPKTARTEAVVICLI 441 
190 : TAACAAAACrrTTAAAAAACTACTTCACAAACTTTGAAAAGACTCTATAATAGTAAGACrrCAAAATA^ 3OOO 

ATTcrrrrcAAA m ; : . i catcaa cio; : : cAAArrrr:7rcACATATTATCA7TC7CAA L ; : : i ATrcTTCACTCCAKccAAccACTTececAArrcTC 



300; ACC Ll ; CACCCCGCTAACACCCCTTCCAATCCCgTACCCACTATOCTATSTTGCCGrrCGAACACTTCATCAAAAACTTTA 3084 

TCCCCAAAACTCCCCCCA7TGTGCCCAACCT7ACCGCA7CCCTGATACCATACAACGCCAACCTTCTCAACTA C] ! 1 1 lU AAAT 
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(SEO ID NO:il) * aacac ci cl . . i l : ; u i atttxtcttaccaxa i : i ll l i u mTr*ccTACTACCATAC L>. i u i i i u i xcTC«rrjuuu>ACACccTArrTauukTrcxc 

^ ^ TTCT Cfi ArrfSAAAaAAAAATAAATACAATCtrrTTAAACCGACTTTAATCCATa^TCCTAT C^ I 1 1 lUi HXA TAAACTTTAACTC 

(SEQ ID NO: 12) 



101 TTTCAGACeATCTACCATCGAAAAATCTCTTATAATAATCGAAAAOGAGAAC CC CATCCACJ^^ 

AAACTCTCCTACATOCTA CL U 1 i 1 AGACAATATTATTA LL . li i LL i H 1 CCCgTAmn t TT L - l AAAATAATTA 1 L 1 1 ACTACTCCACTAACCACTT 



(SEQ ID NO: 10)



HRKILLIEDDOVIRO 



200 
15 



201 CACATTGGGAAAATCCTCTCTQUkTCCCCArrTNAACTCCTCCTWn"ACAACACTTTATO jOO 
CTCTAACCCTTTTAOywyUlACTTACCCCTAAAKTTCACCAfXyiCCATCTTCTGAAATAC^^ 

KOICKnLSEWGPXVVLVCOrMSVLSLFVOSEPKLV <» 

JOl TCCTCATOCATA ilLtUilH^ CC LilU; 1 l AATCCTTATCA ClC^ll^i aiCCJUUlTCCCCAACATTTCCAACCTACCTATCA t^ I ^Li HJ CAOACA 400 

ACCACTACCTATAACCAAACCCGAA C AAATTACCAATACTCACCACACrrCCTTTACCCCTTCTAAACC^ 

SO LMOlCLPLFNCYHWCQEIRKISKVPIMFLSSItD t2 

401 CCAGCCTATCCATATrCTCATCCCAATCAATATCCOCCCGCATGA C > i * U i U ACCAAC H. IIIL LA CCACCAC L. i i L ; 1 i I ACCTAAGCTTCACCCCTTG SO 0 
CCTCCCATACCTATAACACTACCCTTACrnATACCCCCCCCTACTGAAACACTCCTTOaUAAACTO^ 

B) OAHDIVNAINNCADDFVTKPrOOOVLLAXVOCl, 115 

so: 1 i 1 cm 1 CCTATtSACnTTCfMgCTCJLTCJLCACTTTerTGaAATATCL 1 11. 1 1 ATCCTCAATACeJUUtTgeATCCATTTArATT^TT-AAmWCft^ «00 
AACCCAOCAACCATACTCAAACCCOCACTACTCTCAAACCACCrrATACOACCACiaTACGAaTT^ 

ll<LRRSYEFCROE5t.:.C¥ACVILNTXSMDLHY0COV 14» 

«0: TCTTCAATTT^CCAAGAATCAATTCCACWTrrTACGCCTCrTATTT^ 700 
ACAACTT AAAL 1 U* ! I L : I ACTrAACCTCT AAAATCCCCACAATAAACTCCTACCTCCCTTCTAGCATCCTCCACTCCTCGACTAOT 

ISO LMLTKNEFCILRVtFEHACNIVAROOLHltELMN 192 

1 0 : CACTCACrrfrrCATTG ATGATAATACCrr CTCTCTCAATCTCCCTCtrrr^ S 0 0 
CTCACTCAAAAACTAACTACTATTATGCCACACACACTTACACCCACCAXACCCATTrrTCAACCT^ 

Hi SDFriDDKTlSVKV'ARLPKKLEEOCLVCFIETK 21S 

8 0 ; AAAC C AATAOCKn'ACCCATTCAACCATCCTTCATTGGAAACAArrTTTTrTASCCTATCTCCGCTCC^ .... l ATCTAT C luLI ULU It. 90 0 

TrrCCTTATCCCATCCCTAACrrCCTACCAArrAACCTrrSTTAAAAAACATCCCATACACGCGACCGCATCA^ 

3l«XCtCYCLKHA- 



9 0 1 CCA! i ICIH^ILIIACT ,. . ,LA^ ATT7CCCAC7C7AGSAA7TTA C 1 I CUI CTA C. ! 1 > iL. ILl lUiUI «UL J 1 lUl AACCATC7TATTTTTCA 

CGTAAACAACACAATCAGAAACTCAAAAATAAACCCTCACATrrTTAAATCAACGACAT n K l> A tk ft rt »fceAeJUtCCJL>ArATTr■ffT*qjil^T^^AAACT 
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(SEQ ID NO: 14) ^ xjuuua^cACTWwwxxccxAaiccTTCcccAT^ loo 

(SEQ ID NO: 15) a U IC X a i C A UU ilU.IU.llUi U MUtfCCCTAWTCC^TTCTTTCCXCCXTACCCrrCGXAJU>CT 1 11 LL I lU^ TCrgTTCCGfcC 

(SEQ ID NO: 13) l X0TCT T»TPaXLCItll*OMATFVIOfrKaTLATL 33 

xoi CTTceeATTATTTTTCAirrACAAa kji^iiiLiLLiUU i iLiiiuJ i Lliiiu^^ aoo 

J4LP1IPHL0CV. SPLIFCLLAVIOHTrPXfACriCGC «7 

201 CTAACCCTCTCCCAACCACTCCTCGACTafcTTTTCCCATTTCOCCCTATC ITCTCTCTCTACCTTCCCATTAI C 1 1 CT TTOCACTCTCATATCTTCCCAC 300 
CATTCCGA<>C CC I T GG T CA CCACCTakCrAAAACCCTAAACCCOCATAGAACACAfa»G»TtKAACCCTAATACAACAAA C C^^ 

«8 KAVATSACVirCFAPirCLYLAlirrCLfiYLCS 100 



3 0 1 TAT<UTTTCACTCTCTACTCTCXCACCATCGATCCCCCCll-l lA 1 4 1 
ATACTAAAGTCACACATCACACTCTCCTACCTACCOCCCACAAT 

101 M I S L S S V T A S r A A V 114 
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(SEQ ID NO: 17) . SJ*«Si??Ii^Si???S?I??iiS?i?rcS^^ 

(IS 5S SSile! > . , . . , . « . . s T - = . . V . » : . r . . . T c . , V „ 

„ » , V I O « T D Y 0 T f » S V B T I U ! r r L P F * T T C V » » Y (} 

«j CLRAISWVXDMICKDLHRTFSSLrYtCt^CTlLT 9S 

„TAVYi:,AYPLrrTDHPlVItKVyi,VMC:0l.lA01F 129 

4 01 TTTCAAT CCAATCCCTCXATGAACCTCTCCAXAATTACAu: 1 ILlti i 1 iA CAAAAC TCC 4 60 
AAACTTAGCrrACCCACrrACTTCOACACCrrn-AATCTCAAAGAGAAAATCT^^ 

130 slCWVMEALEHYSrSfTKI. 1«B 
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[SEQ ID NO: 20) ^ SIScrlllTcSiCTAOTlIIcT^ 
(SEQ ID NO: 21) 



1 0 1 CCCaCTACCC^TCCfcCJlTTCXAAAXAgrrTTAACCCCOlO 1 CI LCH ATCCCAASCTXn'ATCTACTC GCia CCCCCATTCCCXATCTACATCATATCACT 
CCCrCXTCCCTfcCCTCTJU Uii i i i 1 I CXAAXTTCCCCGTCAGACCGATA LLUl i CU ^TAaATCA U. U 1 lU-Ui CTAACCCTTXCATCTACTATACTCA 



(SEQ ID NO: 19) 1



MQZOKSrKGOSPYGKLYlVATPICNLDDnT 



300 
JO 



301 TTTCCTCCTATCCACACCrrGAAACAAOTCCACTGCATTCCTSCTCAt^ 100 
AAACCACCATACCrCTGGAA C U i L 1 1 CACCTCACCTAACGACCACTCCTATgCCCCTTATCTCCCCAAAACCAGTTCCTAAAACTCTAAAG U i U« H L i; 

31 FRAIOTLKCVDHIAAEDTftKTGLLLKHPDlSTKO «4 

J 0 1 ACATCACrrh'CATCACCAaulTCCAAACCAAAA^^TTCCTCATTTGA UUU 1 U L i iLXAAGCACGCCAAACTATrGCTCACCTCTCIXUTCCCCCTTT 400 
TCTACTaUUUCTACTCCTgrrACCTTTCCTTTTTTXAWSACTAAACT 

6S ISPHSHMAKCKIPDLICrLKACOSZAQVSDACL 97 

401 CCCTACCATTTCACACCCTKn'CATGATTTXCTTAAGGCACCrATTGAGGAACAAATTCCI ^ 1 IHL CC 500 

CCGATCCTAAACTCTOCQACCACTACTAAATCAArTCCSTCCATAACT L;.* *L1 1 I AACCTCAACACrCACAACCTCCATGCAGA CU*LL. 1 AAAOACOO 

98 PSISDPGHOLVKAAICECIAVVTVPCTSACISA 130 



5 : : TTGATTCCCACr Uji 1 I ACCCCCACACCCACATA7 1, . . . * ACO ai 1 M 1 i ACCCAGAAAATCACCTCAACAQAACCAA I 1 1 i i 1 CCCTCT AAAAAAGATT 600 
AACTAACCCTCACCAAATCCCCCTGTCCCTCTATAGAAAATCCCAAAAAATCCCT L: 1 1 i ACTCCAC l HSrC ri CaU AAAAAACCGACA i i 1 . ULi AA 

DlLIASCLAPOPKIFYGPLPRKSCQOXOrFCSKKDY 1C4 



6 0 1 ATCCTCAAACACACA : 1 . > , l ATCAATCACrrai7C?7G7ACCACACACC77CCAAAATATCTTACAACrCTACCCTCACCCCT LU,t iU. . * it. CTCAC 100 
TACGA LU ' lOI CTCTAAAAAATACTTAGTGGAOTACCACATCCTCTCTCCAA L^. 1 . . ATACAATCTTCACATCCCACTGCCCACCCAACAAAACCAaTC 

US PETQirYESPHBVADTLEMMLEVYGORSVVLVIl 19t 



7 0 : CJAATTCACCAAAATCTATCAACAATACCAAACACqTACAA i . :Li^J lATTGCTCGAAACCA7rrCTCAAACCTCrCTCAAGCCTCAAiU: ICJIUATT 
CrrrAA ClU; : I n AGATA LULI i ATC<rrrrC7CCA7CTTAAAGACrrAACCACCTT7CC7AGACACmCCAGACACTTCCCACrTACACJ^^ 



:9» ElTKIYCCYOdCT'.SE 



E 5 t S E 



L K C E C L L I 



800 
310 



S : : 37TCAAGC7CCCACCAAACCTC7GCACCAAAACGATCACCAjmA C 4 * . . L . 1 AGAAA7CCAACCCCC7ATCCACCAACCCA7CAAGAAAAATCAACCTA 
CAA C7TCCAC0GTCC7TT CCACACC7C , .... CC7Ar7CrrrC7CAA CAAGAA 1 C U . AGGTTCGCCCATACCT CG i f LCU i A C 4 « L ACTTCGAT 

SJlVECASKCVEeKDEEOLrLCIOARIOQGHKKHOA: 



77AAGaAAATACCTAACArrrACCACTCGAATAAGAC7CAACrrCTACCCTCCCTACCACCACTCCCAACAAAAACAA7AAACCCACAC^ 1000 
AA77CrrrrATCCATTCTAAATCCTCACC77A77C7CAGrrGAGATCCCAC:CCA7CC7CCTCACC L. .L. 1 i 1 ILl i ATTTCCC7C7CTCC7ACATTAT7 

KEtAKIYOWNKSC'. YAAYHOWEEKO- 390 
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(SEO ID NO: 23 ) l ATGCCTTCCTTAXXWUUUKTO«ATCCTrr^ 100 

Tn Mrt.O/\ TACCCXXCOUlTTTTTTTCCACCCTTACCACJUU^TrWC^ 

(SEQ ID NO: 24)



X 0 1 TTJgTTTCXJUiarrTTXCCTTgrCCTATJUkTJUlfcTTrATOATAAAAAAT^ 2 0 0 - 

AXTTAWlCTTTCaUUkTCCJUWJ^CCATATTATCTAAATACCTATTrrrrATAL U i i i iACAGACTCCTAAACCCTCfcCTCCAXl I Ul* 1 1 i AACTATCC 



(SEQ ID NO: 22)



MOXKyEKISQOLOVTLlCOXDT 



201 (rrTCTAACTTT£a,CACCTCAACCGCCCACTArrCCCTrrATCt:CCCGTTATCCCAACGACATGA j o 0 
CAACATTCAAACTCTCCULi ILCL L U L ltJ ^TAAOCqAAATATOgCCCAATAC UUl I CC IG TACTCACCATCA^ 

32VLSLTArCATIPriA"TfRKOMTGSLDlVAIItAII 5S 

3 0 1 TTG ATTTCCATAAAACTCTGACAAATCTeAAT<UCC(rrAACCAACL 1 U i L i i ACCTAAOATTCAA CAACAAC CTAACTTaACCAACCAATTGaAACAACC 4 C 0 
AACTAAACCTATTTTCAaACTCTTTACAOTTACTCGCATTCCTTCXUCACAATCCATTCTAAC^ 

56 DLDKSLTKLNDRltEAVLAKIOCQCKLTXlLCEA 66 

4 0 1 TATCTTACT7CCCCAAAAA'iTACCAGACCTTCAACAACTCTATLl ILL i lATAAC CAAAA CCCTCCTACCAA CCCWAC CArrCCCCCTGAACCTCCACTC 500 
ATAGAATCAACGGCTTTTTAATCCTCTCCAA C i . U . 1 U ACATAGAAOOAATA U LC U > * CCCACCATO O 1 1 ULU 1 1 l^C TAACCGCCACTTCCACCTGAC 

B9 :LVAEKLA0VEELYLPYREKRRTKATIAR£ACL 131 

501 I ; I C C T CT 1 C CrCCTTTCATTTTCCACAATATACTTCACTTACACAAACAACCT GAAAA CTTCGTCTC^ CCTTCA COO 
AAACGAGLAACtjACOVAACT ^AAACt J T CT T ftT ATCM" T gTTT CTT eGAgTTTTCAAC CAGACACTTCCTAAACCCTCACCCTTCCTTCCGAACT 

l22rPLARLlLOWlVOLtKEAEKFVCECrATCKEALT 155 

601 CCCCTCCACTTCATA L 1 1 1 I CCAACCCTT ATCCCAACATCTCAC L . . > I CTATGACTrATCAGCAACTCCTOAGACACTCTAAACTCACTTCTCA 700 
CCCCACCTCAACTATAAAACCACCTTCGCAATACCCrTCTACACTaSAACCCAAGATACTCAATACn'C^ 

156 CAVOILVEALSEDVTLRSMTYOEVLBHSRLTSO 

101 ACCCAAGCATCAAACTCTTCATGAAAAGCACCTrrrrCACATTTATTATSAT^^^ SOO 
TCOCTTCCTACTTTCAGAACTA Ll I i iwi^. CCAAAAACTCTAAATAATACTAAAAACTCTCTCTCAACCTTGATACCTTCCCATACCATGCAACCCACAC 

119 AKDESLDEKOVrorYYOrSETVCTMOCYRTLAL 231 

801 AATCSTCCCCACAAACTTCCTCTCTTCAAGATCCCTTTTCAACATCCCACOGA^^ 900 
TTAGCACCCCTrTTTGAACakCACAACTTCTACCCAAAACTTCTACCCTCCCTGCCATAACAACCCAACAA^ 

;:2NHCEK:.CVLItICrEMATDB::.ArrATRFKVKHftY 255 

9:1 ATATTCATCAACTTCTTCACCAATCCCTTAAGAAAAA Ui 1 C : lOCv 1 G rTArrGAGCCTCSTATTCCCACACAATTAACTCACAAACCTCAACACCCAGC 1000 
TATAACTACrrCAACAACTCCTTACCCAATT ^. . ; . . CCACAACCCACCATAACTCCC>CCATAACCCTCTCTTAATTCACTCrTTCGA L - l I L i (.CU 1 CG 

25( IDEVVOOSVKKKVlPAlEFRlHTELTEItAEECA 3BI 

1 0 : : TATCCAA L CTCACAATCTCCCCAATCT :., : L ll (Lg l I C CTCOi CTCAAACCCCC CO.GGULiiU aATTTCACCCAC CLiULLi ACAOCTCCC 1100 
ATACCTTCAAAAAACACTCTTACACCCC7TACACCAGAACCAACCAOGTCACK r CCLL CCeACCAACAACCTAAACTCCCT CC CAAACCATCTCCAOCG 

219 :cUrSCNt.R»LLiVAPLKCRVVLCPOPAPRTCA 121 

: : : ; AACTTACCTC? C C T CCATCCAACACCAAAAATCCTCACAACrCACCTTATTTATCCTCTTAAACCACCATCACCTCCTCA^ 1300 
TTCAATCCACACCACCTACCTTCTC L 1 . i 1 1 ACCA U . L « I GACTCCAA7AAAT AGCACAATTTCCTCCTACrCCACCACTTTAC C 11 L i 1 CUU llC i 1 i C 

i::r. LAVVOATCKnLTTOVIYPVKPASAROXEEAKKD )SS 

is;: ArTTAGCACATTT AATTCCTCAATACGCTCTAGACATTATTCCCArrOCAAATGCAACGGCCAGTCCTCAA^^ 1 3 00 
TAAATCSTrrAAArTAACCACTTATCCCACATrrCrAATAACOCTAA L!.. . . ACCTTSCCSCTCACCACTTrCACTTCCAAAACATCCCCTTCAACACTr 

)l( :.A = L:CCYCVE:iAlCKGTASRESCArVAEVLK 30B 
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1301 ACA 1 1 1 LLL lU UlCTCXCCTATCTTATCCTTAATtJUa i* 1 U« ; t>L 1 1 L ' i\j I L 1 ATTCT GCCACCGAXC TTCCTCgT CACGAC rrrCCfcCACrTC;^^ 1400 
TrT hM J BG<lti CrTC^GTiyihrKCJkMM ^ ' CAACCASCACTCCT r AAA rvr rtrrajuirTOg^'V^ 

309 DrPXVSYVIVMESCASVYSASELAROCrPDLTV 431 

X 4 0 1 CAAAJUtfrCCTCTCCCATTTCrATCC L - LU^itun iCC AACATCCTrrroCSCAATTOCT CAAAA TCSATCCTfcACTCAAI i CO I U m;OTCAATACCAAC I S 0 0 

422 EK*SAISlAR»tODPLAILVKIDPlCS ICVCQYOH «59 

1 SO 1 ACQATtyrCACTCACAACAAACTATCrtUCJUnrTOai . 1 H« i HI I CGA TACACrCCTTAACCAA Oi 1 1 U i CAATqTCAATACACCrACCCCACCTcf 1«00 
TCCTACACTCA UiUi IH i 1 LA TACACTCTCACACCTCAAACAACACCr ATCrCACCAA 1 iU*l f CA ACCACACTTACACTTATgTCCATCCCCTCSACA 

45« DVSOKICLSES1.0rVVOTVVIIOVOV«VHTASPAL 4a« 

1601 TCrrTCACACTASCTCCACTCAACAAAACTATCTCTCA^ 1700 
ACAAACTCn^CATCCACCTCA C TT G TrTT a ATACACA Cl i riATAACACTrrATGCCCCTCL'lTCirCCTTTTTACTC^^ 

489 LSHVAGLMICTISEHZVItYAECECKITSRAOXKK S31 

1701 Gl I CCl COIL IG GGACCCAACC CL 1 * 1 L* AaCACCCTCC7C o . . 1 UU H Lk.l ATCCCrCAAACTACCAATATCCTTCATAATACACCACTTCACCCACAC 17J9 
CAACCACCACACecrCKrrrCCGGAAACTCCTCCCUCCUCCAAACGAACaiTAXyyUCTTTCATO^ 

S33Vp|tl,CAKAPECAACrLRI PtSSNI CDHTCVHPE SS4 
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(SEQ ID NO: 26) j tactpccccxacc o i m l i i allc ; i (J i LXATgrcxxc G i l : um u uuu ^T«7TCAACTTAACJ^TrTTau;jw;c*cTc^Ac^ xoo 

(SEQ ID NO: 27) ATGA CLC L L.1 ILCU UUUUIATGCCACXAGACTTACXCrrCOUJUaCJU Li i 1 I XCCACTTCXATTCTAAAACTCTCCTCA Ui lUL* I CSCTOrrACCCC 

(SEQ ID NO: 25) l TCARVSypVLIIVKVFLSKCEVXIPltALirCAXXS )) 

101 ACOTCTCATOCUACCATCCmKCACATArfcTAATAAA 1U« llj 1 i LLL U 1 UAALU HI I Li:: OGACACCCCCTAACAC 1 1 1 LUACACCCACTCCTACTA 300 
TCCAGACTAC L : l U* l ACCACOTCTATAAOirrATTTACCAaUkGCCAAACTTCCAAAACCACCTCTCC C C C ATTCTCAAACCTCT^^ 

J4RSDRTMVA01VIII0VPrERrBCDCLTVSTPTCST «7 

JOl CTCCCTATAACAACT C ' Hill L, CCC C 1 lilLl f ACACCCTACCATTCAAC L 1 1 i l. CAATTAACCCACATTCCa^;CCTTAATAATCCrrcrCTATCCAAC 300 
CACCCATA l i U H CACAGAACCCCCACCACAAAATCTOCGATggTAA L i I CU UkACCTTAATTCCCT^AXgSCTCCGAATTATTACCACAGATACCrro 

Ce AYNKSLCCAVLHPTIEALOLTCIASLNHRVTRT 100 

301 ATTtKWCrcffCa^TTArrmCCTAAGAACCATAAC^ 400 
TAACCCCACAACCTAAT AACACCCATT L llLVl ATTCTAACTraAATAAG U i 1 U . i L t . 1 L CTAATAGTATGATAAACCCAA C 1 U U ATCCCAAATAAOA 

101 LCSSi:vPK|COKIE LIPTBNDYHTISVDNSVYS 131 

401 TTCCCTAATATTGAGCCTATTCAGTATCAAATCC ACCATCATAACATrCA L ill U 1 CCCCACTCCTACCCATACCA U 1 U U 1 1. CAAC LL. 1 L» 11 AACCATC 500 
AAWCATTATAACTCOCATAACTa^TACTTTACCTCCTAOTATTCTAACTCAAACACCCCTCACCATCCCTATGCTC^ 

134FltNI£RI EYQIOKKXIHPVATPSHTSPWNRVKDA 167 

5 0 1 CCTrrATCOCTCACCTCCATCAATOAO L I i lu AATTTATCSCACATOAACATCTCAACCTTAAGAC L U 1 i l AAAAAA 576 
CCAAATACCCACTCeACCTACTTACTCCAAACrrAAATACCCTCTA Ll ILi ACACTTCCAATTCTCCAAAAA l 1 1 11 1 

l«« r I C E V 0 E • 175 
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(SEQ ID MO: 29) i cccTCTJuwJuu^ccTxcTaucxcTt^^TxcATg^^ lc. i 1 xtccocwcai u > I re r c dc c i uicfcATATATAccrccT loo 

(SEQ ID NO: 30) CCeXCA i U ILM lUJ kTCXCCrCTCACTATCTACCCTTCXTCATXATJUUUkCTACGXJUkTACCCCTCT i ATATATCCAOax

(SEQ ID NO: 28) 



1 MVVCWOYIPA 



101 CCACACAACGCOGTTACtU!It ;G r C Ci;C tCU UUW^TAaACA liUL]Lii ACACCACA iiU.Uli ATTTTCCTC^^ 3 00 

U;iUlUl ICLLLU UlTCCTAACCACaAACACCTTCTTATCTCTAACCACAATCT CC TCTAACCAAAXTAAAACCACTT IH T AAAC 

llPKKCVTlCPSPRIEIALRPO»*fTfrC0DCV1.0EPV 44 

201 TTOCCAACCAACTTTTAGAAGCAAAAACTGCT ACCAATACCAACXAACATCAT^^ j 0 0 

AAC U^ULUil CAAAA lLilLUil Ui CACCATCCTTATG Ul.Utllt^i AaTAeC LL.lLil ATACTATCGCTTCCTCT Li 

45 CKO VLEAKTATNTHICKHCSEyOSQAtltRVYTrE 71 

301 AMTCACCCTTUTTTAT^TACrrrAAAAACTCCrr^ 400 
TCTXffrCCCATCAATAGTATGAAATTTTTGACCAACCTAAATACTTCTCCCAATAACCATAATAAA lU r C T ILCl ACCACCCAAACTAAGACCCTACTTC 

78 DORSYHTLKTCWlYEECYMYTLOItDCGFDSRXH no 

4 0 1 ACATTCACOCTTCCACACCTACCACG r fl oi TCCCTTAAGCAT7ALLL IL'! I ACCTATCATGAAGACAACCT AAAA CCACCTCCATCCTACTATCTACATC 50 0 

TCTAACTCCCAACCTCTCCATCCTCCACCAACCCAATrCCTAATCCCSACAATCCATACTACrrCTC^^ 

UlBLTVCELAROWVXDyPlTYDCEICLKAAPWYYl,OP 144 

5 0 1 CACCAACTGCCTCCCAAAACCTTCCCAACAAA7CCTACTAeC?C :vl : ! CA TCAGCAGCTATgCTAACTCCCTCgrATCAACATCCTTTAACTTCqTACTA 600 

CTCCTTCACCCA LLLi 1 i iU CAACC U. ILU i ACCATCATCCACCCAACTACTCCTCCATACCATTCACCa^CCATACTrCTACCAAATTCAACCATCAT 

145 ATCWOHLCMKWYYLRSSCAHVTCMYODCLTHYY 177 

«01 CCTAAATCCACGTAATCQ^CACATOAAGACACGTrCCTTCCAACTCAATGCTAACTCCTACTATCCCTATCATTC^ 700 
CCArrTACCTCCATTACCTCTCTACTTCTSTCCAACCAACCTTCACrrACCATTCACCATCATACCCATACTAACTCCACC^ 

lie LttACHCDHKTCWFOVHCHWY YAYOSCALAVHTT 310 

^ 0 : CTACCTCCrr ACTACrr AAACTATAATCCTCAATCCGTTAACrr AATCAAMCT AArr^ « 0 0 

CATCCACCAATCATOAATTTCATATTACCACrrACCCAATTCATTACTTCCCATTAAOlTTTCACACT^ 



2i:VCCYYLKYNGEWV|C 



335 
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9«plS(l rig. II 

(SEQ ID NO: 32) i TTTTATCGATATTTATATTJUUIWUWCCXTTATrCXCCXC^ 100 
(SEQ ID NO: 33) *AAATACCTATAAAT»TAATT C . ; I LU* l JUlTAACTCCTCAACTC»C(KCrXCTATOCCTrCAC*Aa»ATCCTCrATTC^^ 

(SEQ ID NO: 31) 1 MOiYixKAiiHorspooTrLrLADRrLMiTFn jj

1 0 1 ATC0JUW1WTACCTACGTAJUUUUVATTGAACATC7CT 300 
TACCTrCTrATCCUTGCATTrrriTAACTTCTACACATAACTCTACTTCCCT^^ i i l ACCGAACAACTTACTATAATgTC 

j3 icCYLKKItlEHVYSDKXICTGXrCSEWPrFKHITD <i 

301 ACCA l I ;Cf IC ttACACATCAgrAAOCCI^CTAATCTCr CC AAACACGACTTTACCATrTCT CSAAAA TCT CAACA^^ i ! I mnCAATT 300 

TCCTAAACAACCTCrCTACTCATTgaJ^CCGATTAaiU^ ^ CTTTCTCCTCAAATCCTAAAGA 

if DLLeTSVTLAMLWKEEfSISBBLKTIIDLlPVOr 99 

301 TTCTAAACAAKnxrrAGAACATrrCCCTTTCTTCCCAATTCCCCTCC^^ 400 
AACA men C CACATCrrCTAAACCCAAACAACCCrrAACCCCACCCCCTCTCCAACi aCC T ttC AC L L I LL ! L 11 CAACTATTACCrrACrrCCACTCA 

100 SKEOVEHPAFLRIALRETLTHLGCEVDNPIKLT 133 

* 0 1 CACAATAACCTCCCT C CATTTCCAACCCCTCCTCACCAGCLL'l 1 O i I CAATCTTCAGACTCCCAACTATCACCTGATT C AAA AACGAATCAAOTXCA 500 
GTCTTArrCCACCGACCTAAACCrrGCC«CCACTCCrCCayU^CCACai^ 

U3 0HKLPGrCTCAOEAl.VVIILOSRKYKL:tKRI RYN lt( 



SOI ACCQCACrrTTTTCAACTATTTTrCACATAAT L 1 1 L 11 LL 10 1 CC CTeCTAACATTTCTCCr AAAAAA TCrATCAA GGAAC TC CAAAAAAC ACCeCACAO (00 
TCCCCTGAAAAAACTTCATAAAAACTCTATTACAACAACGACACCGACGATTCTAAAOAGGATTTTTTAGATAUl ILL! lUAC H i 1 * . *L TC GG gTCTC 

IST GTFLKYrSONLLAVAPKISPKKSlKELElCTAOR 199 



60 : AATTCCTCAATCTTTTAACACAGATCATTTTCAATTTCAATCCAAGCTCJtAATCACCTATT^ 700 
TTAACCACTTAGAAAArrCTGTCTACrAAAACTTAAAaTTAOCTTCCAGTTTACTCCATAAAACTTGTTCCATL 1 * L . . 1 LL 1 1 ACTTAACACTGGACTC 

100 lACSPHTODrorOSXVKSAIPItNLEESNELSPE 3J3 

AAATTCCCTAATCA LL > . . i CACAACAATCTCACCGrr LX* : I .LA GC777ATTGACCAACTCACACAACCCCTACCAGAACCTCTTCAATTTCATCAAA 800 
TTTAACCCATTACTGCAAAAA .. . L . . . . ACACTGCCCACCAAACTCCAAATAAC7GCTrCACTCTCTTCCCCATCC7CTrcCACAACT7AAACTACTTT 

:)> KLANDL FDKKI-TARLSr I DQVREAVP EPVOFOE I 3«i 

a:: TTCATCCCACTCCCCAATTAAACAAArrTCAAAACCAAAAACTCTCrrTATCAAATCCAArrCACCT^^ 900 
AAC7»CCG7CACCGC7TAA ; 1 . si: . . AAA L . I . . . > 1 GACACCAATAC7T7ACCT7AACTCGACTACCAACCC7TAT7CCAGATA LI ILIL CPCCT 

36'; CASRQLXKFENQKLSLSNGIELIVPHHVYODAE 399 

9 : : GT rrSrrCACrrr ATCCAAJUlCCAAAATCCAACCTArr rrATCrr AATCAAAAATATCCACCATATCCAAACTA^ 1000 
CACACAArrCAAATAC s* . . . 1 LC > . . . ACrrTCaATCACATACAArrA U t . . 1 1 ATACCTCCTATA Uj 1 1 i CATTTA7TACAAA 1 1 1 * 1 AAC L 1 1 H C 

JC: S V E F : 0 » E K C T Y S 1 1. : K N 1 E D I 0 S K • J2S 



: 3 : : rs . . .h actaccact li ill; 1 1 1 illiu^ ctataaacctta llllli ila tcaacatctcaaacaactcatcacctatcaacccatgctcccacaaat 

ACCAACA7CATCrrCACAA C GAAAAACCACCCATAT77CGAATCCCCCAACTAGTTCTACAG > 1 *L1 I CAGTACTCCATACTTCGCTACa^CC LILI I l A 



1100 
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(SEQ ID NO: 35) j JUU^TCTCCTATAATJUrrACWUUUkTfcCTTCrO^^^ 

(SEQ ID NO: 36) "rrTACXCCXTATTXTGXTCnTTTTATCJU^aurCTCCfcACGTfcATXC^^

(SEQ ID NO: 34) 1 KAirrMxrLivcvtLLviv 



100 
19 



lOl ACACTCACTACACnTTA j U LU i CASOGTCCCTCCCCXTTATTCAXCC L I L I U.U WTAC CA*AA Ct?TTCeTAATACCCCTATTCATATTCCCT 300 
TCTOACTCATCTCWU^TACACCAACCACTCCTCACCCACCCCTAATAACTTGCQ^ 

JOTLSTVYVVRQOSVAIXBKfGKYOItVAIISCIHIRL 51 

3 0 1 TC H. i U 1 O CGATTCACTCGATTCCACCACCa ATTCA U I lUtUL i 1 U i 1 U aAACTCATA i i U i M CACACT AAaACCAA C CACAA i U II. 1 ! LL. 1 1 AT J 00 

AcroAAftftrcnrAAfrrafttKTAAcr;Tcr;TqrrTft*^''J"^ CT^'^ > cgTTTCACTATAACfcccAACTCTCArTCTOCT i ecTcrrACACAACCAATA 

54 PrClDSlAARIOLRLLOSDIVVETKTKDHVrVM 8( 



401 CATCCTCTTCCCT Li ICIUI i CCAAAATTAACCTTCGATQkA i lUll lU ACAAAAAACfcTCAC Al lUCCH i U ACCTTCAACACCAAqr ACCACAACAAA 500 
CTACCACAACCG AGAACACAAC t* 1 1 11 AATTgCAACCTACTT AACAAACT L 11,111 CTACTCTAACOCCUACTCCAACTTCTCCTTCAT Cb 1 C i i L i i 1 

130DALI15SVPKLTLDCLPEKKDEIALCV0M0VAEEH 153 



$01 TCACO^CTTACTOCTACATTATCCTCyUUU^CCTrCATTACCAAaiTCtUOlCCAGATCCA^ (00 
ACTCGTCAATCCCCATCTAATJmCAC 7 T T T G CAACTAATCCTTCCACCTTGCTCTACO f CYrCAATTCOTTACATACrTACI f 1 ACTTACCCCCC C TT G C 

154 TTYCYllVKTLITKVEPDAEVKQSMMEXMAAOH IBfi 

«0: 7AACCC0CTCCCACeACAACAATn»CCCAACCTCACAACA7TAAAATTCTCACTGCACCTGAACCCCAA0CAa 700 
ATTCCCCCACCCTCGT G TT CTTAACCCCL 1 1 L U ACTCTTCTAAnTTAACAGTGACGTCGACTTCCC L i 1 LU 1 L 1 > 11 1 L 1 U lCOCAACTACCACACCCC 

187 KRVAAOELAEAOXIKIVTAAEAEAEXDRLHCVG 319 

7 0 1 ATTCCCCAAaU^CCTAACCCCATTCTCCATCCArrCCCXCAErrCTATCACCCAACTCA^ 1 O CCATCACACAAGAACAAATCATCTCTA BOO 

TAACC GG r I G : rCCA7TCCCCTAACACC7ACCTAACCCTCTCACATAgrc»CTTCACTTCCT7CCGTTAOaCCC^ ILlLllL.lUUl ACTACACAT 

23CIAOORKAIVDCLAESITELKCANVGHTEEOIHSX 351 



8C : TCCTCTTCACCAACCACTArrT(gA7ACCTTCAATA CL : 1 ■ .U^>: CTAAACCAAATCAAACCA7 ^1 . 1 1 1 ACCAAATACTCCAAATCCTGTCCATCATAT 900 
ACCAGAACTCCTTCGTCATAAACCTATCCAACrTATGCAAACGGAGArrTrrTTTACTTTGGTAGAi^^ 

354 LLTHOYLDTLNTFASKCHOTIPLPHTPNGVOD: 3K 



9 0 : CCCTACACAAA IL . . ^ * CACCC L 1 1 LL CCCTCACAACAAATAATACACTAATACTCTTCCAAAATCrCTTauUCT ACCTCAC CO 1 LG 1 L 1 1 U CCCTATA 
CCCATCTGTrrAGAACACTCSCCAAGCCCSACT L* : LI 1 l ATTATCTCATTATCACAAG Ci 1 1 1 AGACAA tjl 1 1 G ATCCACTCGCACCACAACGCCATA7 

38'r PTQILSALRAEKK* 
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(SEQ ID NO. / SSA^CTATACCACCrATTTTATCCCAAAAWTAAAACCrrrrCCAAASG^^ 

(SEQ ID NO: 39) 

101 TCTtUAT(>CCAaATTTCCACyUUAACGATTCArr^^ JOO 
AGACTTACTCtrrCTAXACCTCTCrrTCCTAACTAAAACTTTACTTATCCCAAATAACTT^ 

(SEQ ID NO: 37) 1 LUSICPIIItLJtCLSSKELILLC JJ

201 AATTATCCTjuCTATCTTTrrACCCrrTTAT L I LI LI k; I ACTTCTA L 1 L 1 U I H ATATATTATCACTTTCATTTTTACA CCACACA T CAAAAC TATTCTT 300 
TTAATJVCCATT1^TAC»KAAAATggrW^J^AT^^ ^ r^TrAXe^TCACJiCJULATATATAATACTCAAACTAAAAATgTCCT C I C I ACTTTTCATAACAA 

23 IJLSirLPfYLPVVVtCLTlISLlFTGDKKSll. 55 

3 01 CACAAAATCOGGaACCATCCCATGCTt;(.'l'i H I 1 ! 1 L 1 f ACCTATASTACTCTTATATCCAl 1 L 1 1 aCACAAAA TrOCATCCCTCTTG rCCCTTCACTAO 400 
UlLl U l ACCCCCTCCTACCCTACGACCAAGAAAAACAATCCATATCATCACAATATACCTAAGAALt^iL*! 1 i iAACCTACCCACAACACCCAACTCATC 

SSOKHGEHPMLLLFLSYSTVISILAOMWHGLVASVC S9 

4 0 i CAA iCi 1 1 i L I ATTTACTA l 1 1 i L. . U iU O^CTATCACTCCATTTTATCCCATAAArrCrTTCCArTtUTTrrCCax, 1 1 CU. H 1 L, i I lOGTAmUlL 1 1 S 00 
CTTACAAACUlTAAATCUTAAAACAAAAACCTCATACTCACCrAAAATAOGOTATTTAAGAAACCTAACTAAAACCT 

90 MFLFTirFLHYOSILSKKPFRLILOrVLrOSVL 123 

501 CTCAC C 1 UL ! . : i U CCACTTTAGAACATTrCeAAATTCTCAACAAATTTAACTATC L : ! L , i U LA CCCAATATGCACCTCTCCCATCAGAACCCOCCA <00 
CACTCCACCAAAACCCTCAAATCTrCTAAAGCTTrAACA L - i IL ' . i i AAATTCATACGAAAACAAACTGCCTTATACCTCCACACCCTACTCTTOGCCCCT 

123 SAArASLEHFCIVKKFMYArLSPWMOVMHOHRA 15S 

601 CAACTCAC ^ 1 1 L 1 . i AATtCrAATTATTATCCAATTATrrC 1 1 U > > 1 ^ 1 U ATTATCATrcrrTTCTATCTCTTTACAACCACCAACTTCAA 11 CA* i 1 UA 700 
CTTCAC7GGAACAAATTACCATTAATAATACr7TAATAAACAACJUUkCACATAATACTAACCAAACATAGAC^ 

154EVTrFKPMYYCI ICCFCINIAFYLPTTTKLHMLK IBS 

70 : AACTATTCTCTCTCATTCCACCCT7TgTTiUk7CTrrT7CCTTTCAACm 1 1 ILLiG CrATTATCCCTCGACCAATTATCTA 100 

TTCATAAGACACACTAACCTCCCAAACAATTACACAAACCAAACTTGAAATCACrrTrrACCrrCACGCAAAG^ 

19C VFCVIACFVKLFGLNFTONRTAFPAltAGAIIY 323 

BC; TCTCTrTACCACTArrAAAJUkCTCGAACCCCTTrrGGCTTACTATTCCGCTCrrCC^ 900 
ACACAAATGCTCATAATTmGACCTTCCCCAAAACCCAATCATAACCCCAGAACCCCTAACCAAACTCAAAOCACAAAAaATCACT 

2:J LFTTlKHWKAfWLSIGVFAIGLSFLrSSDLCVR 255 

901 ATCCCTACTTTACACTCTTrrATCCAACAACCCArrTCTATCTCGCATCCTCCC^ 1000 
TACCCATCAAATCTOkGAACATACCTTCTTCCGTAAACATACACCCTACCACCCTACCCCAA<>^TTC^ 

256FGTLDSSMCER: SIWDACHALFKONPFWCECPLT 389 

1001 CCTATATCCACTCrrATCCTCtXIATACATCCTCCTT ATCATCAACATCCCCACACTCTTTATAr^ 1100 
GGATATACCTGACAATACCAGCCTATGTACCACCAATACTACTTCTACGCOTCTCACAAATATAACTATGCTAACACTC^ 

29: YnHSYPRIKAPYHEHAHSLYtOTXLSYCXVCTI 323 

no: TTTATTA C . 1 1 1 U . L 1 ! . 1 U * 1 UL . CC 1 U 1 i CSCrrCATCATCCATATCACTCACCACTCCCCn* AACCTCCCATTATCCC H- 1 1 1 ATCTA 1 ^ IM LL . 1 1300 
AAATAATCAAAACACAACACAACCAGCACAAGCCAACrACTACCTATACTCACTCCTCACCCCCrrTGCACCCTAA 

U) ;«LVLSSVAPVRI,HHDKSOCSCKRPi:CLYLSFL 15S 

; : : : AZASTCCTTCCTarSCACCCAA l . ! . :U irrrGCrrrrrTTC7GCATTCACTCASCrrrrA . . . l^* tU CTACTTATCTCCACTATTCCATTGCCnTA 13 9» 
TCTCACCAACCy^CACUTCCCTTAAAAACTCAACCCACACAACACCTAAGTCACTCCCAAATAAAACAA^ 

JS6TVVAVMG: rCLALFH IOSCF: FLLVMCS I PLAL 16S 
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fcvn Tn Mn- 41^l Axayu7TGx*»TCTSCCTaxrrxcrrauirrt^ loo 

(5tg 10 riU. TxcCTCACTTCTACXCCGJU;CXATGJUUrrrAACTACrrrC*TACCCACTACTTrAAC^ 

(SEQ ID NO: 42) 

1 0 1 AAACCTTCTTCCTCCXCXACCTAGATCTCCTXCTAACTACCCTGAC^ 200 
TTTCCJUkCAC(KACCTCTTCCa^TCTA«CCATCXTTGATOCCXCTCTCT 

201 ACACTTGAATTCCCAAAACAAAATCCX a; 1 ! H O CAACCXACTCACCCAT L 1 : * 1 1 1 t^ TTCCCATCTrCCCLU lUAATCCA : It* I I m I ACA* 300 
TCTCAACTTAACG O 1 i U U > U I ASCTCCACCAAA LL 1 i 1 1 G ACTCCCTACACGAAAACCACTAACCCTACAACCCCCACTTACCTXACAACCATCTT 



301 CA(UTTCA<rrCCrrTTCTCCACTC(UCCCCTrrCAACCCCCAArrTCACAA(yiTC^ 400 
CTCTAACTCACCAAACAOCTCACCTCCCCAAACrrCCCOCTTAAA U I U . i C 1 ACTTCTACTTAACCTATgTCGAGCTAAAAACTTrTTACCAATTCATTT 



(SEQ ID NO: 40) 1



401 TCAATCTAAAAGAAAATACACyu^CTTCTTTTTCCACAAerrrCCACACCCTAffrC^ SOO 
ACTTACATrrTCTTTTATCrrcrrTCAACJUUUUlCCTCrrCAACCrrCTCCCATCAC 

2 HVKEKTELVFREVAEASLSAHRCSGSVSVIAV: J4 

501 CAACTATCrrAGATCTACCCACACCCCAACCrrrCCTTCCCCTACCTtrrrCATCATATO^^ «00 
GTTCATACATCTACATCCCTCTCCCrrrCC<UACCAA(WCCATCCACAAGTACrrATACCCACTTTTACCA 

3S KyVDVPTACALLPLCVHHICEMRVDKrLEKYEA «7 

60 : TTAAAACATaJACATCTCACTTGCCATTTCATTGGTACCTTGCAAACACtrrAACGTGAAACATC^ 700 
AArrrTCTACCTCTACACTCAACCCTAAACTAACCATKSAACGTTTCrGCATTCCACTTTCTACA^ 

<blkdr ovtwkl: stiori>kvkdv ioyvdyfhalds ioi 

TCI CACTAAACCTACCAGCGCAJUlTTCAAAAAAGAAGTCACCCACTCATCAACTCTTTCrr^ 800 
CTCATTTCCATCCTCCCrrrTAAG CAC7CGCTCAGTAG7TCACAAAGCAACT7CA7TTATAAACA . . . ^ 1 . L 1 i * CCTTTCTCCCAAAAAC 

U: VKLAGEIOXRSDRVIKCrLCVNISKEESKHCrS 134 

BC : CACAGAGCAACTCCTCGAAATCTTGCCACACrrAGrCAGACTACATAACArrCAATATGrrGGTTTAATCACCATGCCACCT^ 900 
CTCTrTCrrTGACCACCTTTACAACCCTCTCAATCGGTCTCATCTATTCTAACTTATACAACCAAATTACrGCTAC^ 

135 PEELLc: lp£:.arldk:eyvclmtmapfeasse l*"* 

9C; CAtrrTGAAACAGATTrrCAAGCCGCCCCAACATrrACAAACACAAArrCAACACAAACAAATTCa^ 999 
GTCAACTTT rrCT AAAAGTT C CCCCCCSTr AAATGTTTCT CTTTAAGTTCT^ . . . U . . 1 AACCTTTATACCCAAATCT CC I Gl CACCCCCCCCAATC 

ubcike: FrAAcc:.oBE:cEKC : pnmplehtgcry aoo 
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(SEQ ID HO: 44) i CTXCTCCCXCTCCX Lllli *CCAgrAAgrh'ATrATTTA L:iU AATCASC« iOO 
(SEQ ID NO: 45) CXTOfcOGCTCXCCTCAAAATCCTCArrCAAATAATAAATGAAAATTACTCCCTCTT^^

(SEQ ID NO: 43) l TPSPLLAVSLt.PTfM0P0rLVLH01l.VC$LVIL J3 

101 ACTT»TTGCATATATACTTCTAJUUaTCCCXTrrTCnATX(UAT0t7TACCTCCTATT^ JOO 
T<UATAACCTATATATCAACArrrrTACCCTAJWkACAATATCTTACCATCCACCATAAAXTAAATCA 

34 LIAYlVVKIPrSTRMVaAILrSVDDEMEDAARS CI 

201 JkTt»nCCrrCACCTITrrATACTATCATGAA«riTATCATTCCA 300 
TACCCACaUWroauUOU^TATCy^TACTACrrCCAATACTAAtXrrAAATAAAATOCCCAACXAG^^ 

«lMCASPryTMMKVIIPr:tPVVLS VIALMrNSl.LT 100 

3 01 CTCACTTCOACTTATCTCTATTCCrrTAraTCCCCTAGCT 400 
CACTCAAGCTGAATACACATAACCAAATCCTAGGCCATCCAtnTCCTAATCCATAATCCTAAGCrAGACCTCCACTA L 1 . i U 1 4 I ACATTA O* It, 1 

101 DPDLSVPLYHPLAOPLCITIRSACDCTATSNAO 133 

* 0 1 ACCTCTCCTA I U b i 11 ATACAA . i L« I . L 1 U ATGATTA 1 > : L 1 UJ UICCCTATTATACTTCACACAAAC ACCCCGGCCTAAACTAACCAAAT AATCATCA S 0 0 
TCCACACCATAAACAAATATCTTAACAAGACTACTAATAAACACCrrCCCATAATATCAA G 1 U 1 b 1 a i C 1 U CCCCCCCArrrCATTCCTTTATTACTACr 

134 ALVFVYT:VLMi:SC?VLyPTQRPC»KVRK« 1(4 

501 CAC CCACTACTCTTCCCTT ATCAAAT ATrCAAATA G TT G TCACGA i lUl U l ATCACTACTCATTCCTACTATAA i IUjI 1 l AGACACACCaACCAAATC 600 
CTCGCTCATCACAACCCAATACrrrATAACTTTATCAACACTCCTAACAAAATACTCATCACTAACCATCATAT^ I C I L i CL L I L U U i AG 

& 0 1 CCAOCCTCCACCCATCCCAACrr AT ACTATTGrrrOTrr ACCTC CATSTTTCATTATCATOACCAATGAATACGTATCTTAT AAATTTCCCACACGAGAT 700 
CCTCGCACGTCCCTAOCCTTGAATATCATAACAAACACATCCACCTACAAACTAATACTACTCCrrACTTATGM 

701 CCTACACCATTACCACCTCAACTTAT ATCACCTGTCCCTTrrCTACGCCrrCGAACCATT rrrATTACAOATAAAAACXAAATTACACCTCTCACAACTC • 0 0 
CCATCTgCTAATCCTCGACTTCAATATACTCCACACCCAAAACATCCGCCACCTTCCTAACAATAATCTCTA l i 1 . I i l AATCTCCACA LIOMU AC 

fiO: CACCACCCATTTGCCCTTCCCCACCAArrCGATTACrrATTCCAGTACt*. 1 . . » ATCACCCAGCTL 1 . 1 lACTACCCAl 1 1 L i L . 1 1 UMGTCTGATATC 900 
GTCGTCCGTAAACCCCAACCCCTCCrrAACCn'AATCGATAACCTa^TCCAAAAATACTCCCTCCAGAAAATCATCCGTA^^^ 

90: CATCTTCCAACCACTAAAAWTATCTCCAAAATCGTTCrAAAATGArrCAATTCTATATAGTACTT i 1 AC »7B 

CTACAACCTTCCTCAIT ; I 1 : . ATAGACSTTTTAGCAAGArrrrACTAACTTAACATATATCATCAATTTACCAAATC 
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(SEQ ID NO: 47) i ou^TgreTTccccAACTrTTTxcj^ loe 
(SEQ ID NO: 48) CTTACxcAAccccrrcjuuuuiTCTrrrtrrACAACCACTrTTTrcTC^^

(SEQ ID NO: 46) i n n 2 

X 0 1 aiC»eGXTTXCAaUUUCTTC(UJUUlOGCTtgfcC Cl^ W ^ * iCCfcACCCCTTACATCKKWaCCXCTTC 300 

CTCTCCTAATCTCTTCTaU U.Ll > ; i LULA CCTCGXCXCau^CATCCA i a r C 1 Ci a xeAAATACCACAAAAAACgTrCCCGAATCTA C i i i : 1 LU 1 CAAC 

lORI aOtLEKCOAVVLPTETVTCLrSKXLOEKAVD )i 

201 ACCXTCrrTXCaiACTaUUU: C T CV: l C CTJUUCATJUkCCeXCrCAATCTCXATA ICULL: Li i i LU A CCXy Crr eeAC TTTT CAXX CAATCACCCXgC 100 
TCCTACAJUkTCCTTGAgTTTCCACCACCATCTCTATTCCCTt^^ 

37 HVYOLXRRPHOKALWLHIASriDILHrSItMOPA <» 

J 0 1 TTATCTACAAAAACTTCTACACA L^i 1 I i iu COUS ClLull lU ACCATTATTCTCGAACCCAATCACCCA i;! f CC C T ATTCCCTAAATTCTCACCTTCCA 400 
AATAGA IUI 1 1 L I U AXCATCTCTgCAAAAACCCTCCAGCCAACTCCTAAT AAGAC L 1 ILU. 1 1 ACTCCCTCAACCCATAACCCATTTAACACTGaAACCT 

70 yLOJCLVETFLPGPLTIILE~AKDSVPTMVKSDl.A 102 

4 01 ACTATTCCATTTCCGATCCCCACTCACCCTATCACACTCCATTTAATTCCAf^ 1 LL^ 1 1 L ATTOgCLLU I C lOCCAATATCTCACCTCACCCAA 500 

TCUTAACCTAAAGCCTACCCGTCAGTGGGATACTCTCACCrAAATTAACCTCTCTCTCCACCGAACTAACCCCC 

103TICFRMPSHPITtDLIRETOPLICPSAHISC0AS 1]( 

SOI CTCCTCTAACCrrrCAACAJU^TTCTCAACCATrrTCACCAAGA U. I i L IbUj i C 1 W ACACCATCCTTrrCTAACTCGACyXUTTCAACTATTCTCC^ COO 
CACCACATTCCAAACnTCTTTAACACTTCCTAAAACTanTCTCCAAOACCCACACCTT 

CVTrEOlLKDrOOtVLCLEOOAFLTCOOSTlVO 1<9 

£ 0 ; TrrCTCTCCACACAACCTCJUUUlTCrrACCCAACCCCCAATTAA^ 700 
AAACACACCTCTXrrTCCACnTTACAATO a; : I C C C C C TTAAT7TCC7C7TCTATAACAACCACCCAACK:TCTCrAAACAAAACTCCTCCCAACTTTAC 

170 LS COKVKILPKA0LHEKirLl.CC0Rri.LRRl.gH 303 

7C: CTAACACArirCCAACAAAMATCTCAAACCCATATGTMCATCAACCAACAGGCTrTCCCrTATAC^^ BOO 
CATTCrCTAAACCTT L. : .U TCTACACrTTCGCTATACACTCTAO. .OC : :CTCCCAAACCCAATATGAAAATCACgrCT LLl i I CCCCATCCCTTCATC 

20) LBOLOETDVKAICOINOEALCYTFSPEETASOLA 31C 

a 01 CT AGACTCTCTCACC AT7CCCATCATTTCrr ACTTCC CT ATCACCA7CCAC CT AATCA. U . w * » ACTTCGATATGTCCACGCTCAACnTACCAATCACT 900 
CATCTCACACACTCCTAAOCCTACTAAACKATCyU^CCCATACTCCTACCTCCATTACTACACAATGAACCTATACACCTCCCACrTCA*^ 

23? RLSOOSHHrLLCYCDAANHVLLGYVHAEVYESL 3C9 

901 CTATTCCJUUWCAGCATTTAATATCrrACCTTTACCACTTTCACCTCAACCCCAAGCTCAACCTAT^ 1000 
CATAA&; '.' : I * AAArrATACAATCCAAATCCTCAAACTCCACTTCCCCTTCCACTTCCATACCCATTTTCAAATCATCTTCCCAA CL M U 11 L U 

27C YSKACFNtLALAVSPOAOCOCIGKSLLOCLCOE 303 

1001 CCCAAAACAT C T CG TTATCCCTTTA ILLUL. i AAATTCTCCCAATCATCCICTC GC T C CTCATGCAT7TTAT GAAXAA U1 1 tA* CT AT A C 1 1 L. I b ATAAAA 1100 
CSCTTTTCTACACCAATACCCAAATACCCGAATTTAACACCCTTACTACCACACCCACGAGTACCTAAAATA^. . . . I CAACCCATATGAACACTATTTT 

JOJAKRCCYCF IRLNSAHHRLCAHAFYEKVCYTCDKH 33C 

1 1 C : TGCACAAA CU.. . . A77CCCAT A b.;iC A ...lL. 1 ArTCTAAAATCAAACTAATOACTACTCACACAATAAA G CACAAGACCTATCA : . 1 MU 1300 

A LU:l. . t CCCAAATAACCCTACAAAATCAAACTAAAACAA7AACAmTACT7T3ArrACCTCATCA U,UH»l l A l . IL^iH 1 CTCQATACTAAAAAC 

11- C K B r : R 1 F • 345 
(SEQ ID NO: ) 1 AAGfcTJU^TAOIUUUkTJUUATCTAACGJWTCJWyWUUU^ 100

^ ^ ' TTCTATTAT C; i i l ATCTTACA l iULl i XCTCT L. . i 1 1 ACCCTAAXCCTCTXTTACCrTTACC M 1 . U . . U ATXCXXX L J L . 1 1 1 XTTtXAACXAATA 

(SEQ ID NO: 51) 

1 0 1 CCTCATTXraTOCTACTACCAACTTTATTaxyUlTTTTTCCAAC^ 300 
CCXCTAATACTXCXiATCATCCrrCAAATAACCCTTAAXAACCTTCAanTAACCACCCyUU^^ 

301 CCATTTCCCACCC L* , i I i i AXXCTGXCXXCAXXTXATGXCTX lun i I I XCXTXCXCCTXXCATTAACSTCAXCCCTCCTAATCCTCCCCXTCCTATCC 500 
CCTXXXCCCTCCCGXXAXXArrTCACr Ll IL.l i ArrACTCATXCAAAAATCTATCTCCXTTCTAATTCCXCTTCCCUCCXTTXCCXCCGCTXCeXTXCC 

(SEQ ID NO: 49) 1 MrLDTXItIKVKACKGCOCMV 30

J 0 : TTCC L i . lL U!L L.i CXAXXXTATCTCCCTAXTCCACCCCCrrCCC C r CC TCUTCGTGCrrCCT(XUCCCAATC^ 1 U I AaxCCXXCCXCTXCC 4 0 0 

XACCGXXACCXCCA L: l * l iA TXCACCaArrACCrCCCCCAACCCCAC(>CTACaiCCACCACCTCCCTTACACCACA»CCA^ 

31 ArRRCXYVPHCCPWCCDGCItCCHVVrVVDECLa S) 

401 TACCTTCUTCCATTTCCCCTACAATCtrrCXrrTCAACGCTCATTCTGCTtLAA^^ $00 
ATCGAACTACCTAAACCCCATCTTACCACTAAACTTCCGACTAACXCCAC i U 1 . CCCTACTCCTTTCCCTACGTACCAGCACCACyiCTCCTCCAATCT 

S4 7LHDFRYNRKFKA0SCEKCMTX0HHCRGAEDLR Sfi 

501 CTTCCACTACCACAAGGTACCACTCTTCCTCATCCOCACACTOCCAAC oi 1 1 l AACACATTrCATTCAACATCOCCAXCAATrrA iCUi lo CCCACPCTG 600 
CAACCTCXTCCTCTTCCXTCCTCACXACCACTACCCCTCTCACCtnTCCAAAATTCTCTAAACT 

STVRVPOCTTVRDAETCXVLTDLIEHCOErXVAHOC 130 

601 CTCCTOrrCGACCTCGAXATATTCCTTTCCCaACACCAAAAAATCCTCCACCCCAAATCTCT^ 700 
CACCACCACCTCCACCTTTATAAG CAAACCC CTCTC w. . . , i i ACCACCTCCCCTrrACACA ^ W 1 1 ACC7 L : 1 1 CC ^CT C L ; I U CACTCAATCTTAA 

131 RCCRCNtRrXTPKMPAPCISENCCPGOCRELOL 1S3 

701 GCAACTAAX^^TCTTCCCXCXTCTCCCTTTXCTACCXTTCCCATCTCTACGCAAGTCAACA C. * 1 i AA LIC . l ATTACCTCACCTAACCCTAAXATTCCT 100 
CCTTCArrrrTACAACCCTCTACACCCAAATCATCCTAACCCTACACArCCCTTCACTTCTCAAAAT^ 

IS4 ELKILASVGLVCFPSVCItSTLLSVITSAKPKXC 186 

B 0 ; CCCTACCACrrTACCACTATTCTACCXAArrrXCCTATOCTTCCCACCCAATCACCTCAATL^. I ILCACTAG CCOACTTCCCACCmTCATTCAACCCC 90 0 
CGCXTCCTCXXATCGTCATXACATCCTTTAAATCCATACCAACCCTCCCTTACTCCACrrACCAAACCTCXTCCC 

laiAYHrT T : VPK:.CMVRTOSCESfAVADl.PCL I EGA 33 0 

90 : ctxctcxxcctcttcctttcccxxctcxcrrcctccstcxcxtccaccctacaccrcttatccttcacatcatt^ 1000 
catcacrrccacaaccaaacccttcactcaaccacccactctacctcccattrrccacaataccaactctac^ 

scgvglctcflrhieiitrv:lhi:3msasegrd 35) 

loo: tccatatcaccattacctacctatcaataaa c a c ctcca u i l ! 1 acaatcttcccetcxtccaccctccxcxcxttxttctxxctxxtaacatgqacatc 1100 
xcctata ctcctaatcgatccatacttattt rrccacctcagaatcttacxaccccactacctcgca lx i g jl ctxatxacxttcxttxttctxcctctxc 

3s4 pyedylaimkclcsynlrlhespc::vtnkmdm jqt 

n C : r rrCACACTCACCAAAATcfrCAACAATTTAACAAAAAArrCCCTGAAAATTATCAT^ 1300 
GCArrCTCACT l.!, . : : . ACAA L. I L . I AAATT ^ . . . . l i AACCCA L . 1 1 l AATACTACTTAAACTTCTCXATCCTCCXTACAXCC C TTXAACXCCTAACr 

38->PCSOEKLCEPKKrLACHYDEF£CLPAI FPI SCLT 120 

: 3 : : CCAXCCAACCrCTCCCXACArTTTTACIfcTCrrxCACCTCXATTCTrACAauW^ 1)00 
GCrTCSrrCCACACCCrrS7CAXAATCTACCATC7TCAC7TAACAA l (. I C. l C i LTGC7CT7AAAAACGACA7CCTGCTCACCCTATA LLI ILl ILU CA 

»fOCLAT**lDATAELLDICTPErLI,YDESOHEEEV }53 
1301 TTACTATCCATTTC^CtW^CAA^l^AA^ACC^^ U i L 1 l UAXAAA CTCATCAAACTCTTTAAT 1400

JUTGJkTACCTAAA LlU-1 ILWLil * * iCCA JUUCrrTaATCACCXCTACTCCTXCGCTCTACCCATCJUUUIACPlCnT^ 

354 YYOrDrEBKAFEISItDDOATWVLSCEKtMKLrK 

1401 ATGACCXA Ll i lU XTCgTCATttAAIt-' H^ILA T SJUA CTTTA 1«<1 
TACTCCrrGAAACTACCACTACrrAaACACTACTTTCAAAT 

3«7HTHrO»DESVMItL 3»» 
(SEQ ID NO: 53) 1 TtXKXTOCcirrrxjUiAXXAMTTGWJir l : L . . l I 1 1 i *TGXAfr»TTA CAAA TrA*ff\JUCXXACCATATTA7 100

(SEQ ID NO: 54) ACCTTACCCCAATT LI U Hj I I AACTTTTA Ui 1 Ll 1 1 1 iUi CXTTCnrrrCXAAttAAAACACAATACrTAATAATCrTTA Li 1 Li 1 1 L , . 1 LL i ATAATA 

(SEQ ID NO: 52) 1 hi 

101 CG CTWU:AJUMWn'ACAACCAAJUlCCAXTTCACCTT CC I Q AATATAAA ll fCOT 1 ICCA TGACCATOTACAC C CT C T C TTATCa 
CCCA L i ! L m CTCAT C 1 i it 1 1 1 1 AACTCaUACCfcCrrAT ATrrAAACCAJUfcCCTACTWn'ACXTCT CO OACACAATACC 



301 AACCUCIUXrrCTTATTCCTCAATTATCTCCTCCTAAOCCrrSACCCTtUCTCC^ lU UtCTCTTAT CJUUl CCTTCAJUUUUkATGCCCA 300 

1 HjL 1 1 CCACAATAACaiCTTAATACACCACCATTCCCACTCCaACTCACCTACXACCTCAACGCAAACTTC^ 1 i 1 U SAA U 1 1 . i i 1 i XCCCCT 

3SHCCVI RELSAAKCEPEWHLEFRLXSytTrKICHPM 69 

J 0 1 TCCAAACTTCCCCfcCCACA L i 1 U i C ACACATTCA C 1 1 1 U kTCACTTAATCTACTACCAAAAACCATCTGACAAACCAC CLLL llLl lO CCATCATCTACC « 0 0 
A LU] 1 I CAACCCCTCCTCTGAACACTCTCTAACTSAAACTACTCAATTACATCATC Ol 1 1 UG CTAgA LiUH lU yTCPCCCAAGAACCCTACTACATCC 

69 OTUCADLSElDrDDLIYYOXPSDEPAXSWOOVp 101 

401 TCAAAACATrAAACAAA LL I i iLJ UCCTATCCCffllTTCCAaU«;crCAACerrCCrrATTTACCACC 1 H C CCCACTACCAGTCACAACTCCTTTAC S 0 0 
A L i i J 1 CTAA 1 1 i L 1 1 1 G GAAACTTGCATACCCCTAACG I ClI CCACTTCCACGAATAAATCOTCCCCCAACaCCCCTCATCCTCACTCrTCACCAAATC 

103 EKIKETFEKICIPEAERAYLAOASAOYESEVVY 1}4 

S 0 1 CACAACATCJWaMAGACTTCaUUUUiTTACCTATTATCTTTAaUUTAC^ COO 
OTCTTCTACrTCCTTCTCAAC Ul i 1 1 l AATCCATAATAOAAATCTCTATCTCTAAOCCCTCACTTCCTTATqOCTCTCAATAA Ai ULl l ATQAAACCCT 

IJSKNMKEEFOKLGI XFTOTDSALKEYPOLFXOYFAK 168 

fi 0 1 ACTTCCTACCCCCCACAGATAACAACrrCCCACCCCTCAACTCACCACTATCSTCSCG^^ 700 
TCAACCATCGCCCCTCTCTATTaTTCAACCCTCOCGAarrGACTCCTCATACCACCCCACCTTCAAAATAGATCCACCU : : I CCACACTTCCATCTATA 

li* LVPPTOMKLAALNSAVWSGCTF lYVPKGVKVOI 301 

70: TCCACTTCAAACTTA 1 1 1 LLL I ATCAATAACGAAAATATACCTCACTTCGAACCTACCTTCATTATLt. i I I^TCACCCACCAACCCTCCACTACCTACAA tOO 
ACCTCAACTTTCAATAAACCCATACTrA : .ULI 1 1 l ATATCCACTCAACCTTCCATCCXACTAATACCAACTACT CC CICO: tCCCAGCTCATCCATCTT 

203 PLOTr FRXNKEK:C0FCRTLX I VDCCASVHYVE 234 

so; CCATCTACAGCACCAACATATTCAAGCAATACCTTACACCCTCCCATTGTACAAA T r T T T CC 7 T T CC ACCCACCTTATATCCCTTATACAACTATCCXAA 900 
CCTACATCTCCT5CTTGTATAACTTC3TTATCCAATCTCCGAC0CTAACATCTT7AAAAACCAAACCTGCCT 

235 GCTAPTYSSKSLHAAIVEI FALDCAYHRYTTION 2fi8 

9 0 : ACTCCTCTGATAACCTCTATAACrrCCTAACAAACCCTCCTAACCCTCAAAACCATCCCACnr^ 11 UU; 1 U CCAAAACCA C 1000 

TGACCACACTA7TCCACATATTCAACCA * iUi 1 1 CCCACCATTCCCA L i L . . CCTACGGTGACXACTCACCTAACTA CLl 1 lU AACCCACO Ol i 1 iU CTC 

2fi9 WSDMVYKLVTKRAKAOKOATVEWIDGHLGAKTT 301 

1 0 0 1 TATCAAATATCCATCTCTTTACCTTCATOCACAACCAGCCCOTGGTACCATCCTCTCTA 1 1 0 0 

ATAeTTTATACCTAOACAAATCCAACTACCTCTTCCTeCCCCACCATCCTACCACACATACCCCAXACCATrACCT .t^lU CTTCTG m CCCACCA 

lO: KKYPSVVLD5ECAIICTML5IAFANAC0HQDTGA 334 

:iC: AACATCATTCACAATCCTCCACATACCACCTCCTCTATTCTCTCTAAATCCATCCCTAAACCTGCAGGAJU^CCTTC^ 1200 
TTrrACTAACrCTTACCACCTCrATCCTCCAGCACATAACACAOATTTAOCTACCCATTTCCACCTCCTTrCC^ 

))SKK:HNAPHTSSS:vsitS:AltGCCKVDYRCOVTFN 3«8 

i 3 0 ; ACAAGAACTCTAACAAATCTCTrrCCCACATTCAATCTQITACCATTATCATCCATCACrrTT 1263 
TCTTCrrCACArTrrrTACACAAAOCGTCTAACTTACACTATCGTAATACTACCTACTCCAAA 

)«» KfSKKSVSHIECDTIIHODL 388 
gep1353 Fig. 19

(SEQ ID NO: 56) 1 ACCrOCJU^TTTATtlACOUCTATCCTATCTT A AAW CG**WA C TCrrTATCTJaCTCgrTAT*ATCA»CTTCXAACTCX^ 100
(SEQ ID NO: 57) TCCACCTTAAATA L1CU1 1 CATACCAT AGAA i i IL* J ^ 1 i CXeAAATACATTCACCAATATTACTTCAA L 1 i IL A Li I >U.LUi i CAAATT ACAAT

(SEQ ID NO: 55) ^ AOJYeOVSTl.ICtC»SVrLTRT»EVOTETATLIL JJ 

101 CGACCTA I t o lUUiU ATASCTA U I i LL W U i A Li L 1 U i Al 1 CI G I CA AT L J i CI ATA l U LU ACCAA 1 1 C CC CCCACATA 1 U U U iTTAAACGAATTT 300 
CCTCCfcTAACACCCCTATOakTCAACCAACAATCAtUJUUlTAAaLOiCnAfl^^ 

34GAIVOJA8SLLLPYSVNLX*yrCQritllDXLXItRIS <1 

301 CACGTTTACCATTTTTTGAAACACATCCTCACTATATCCTTACTCAfcTTTGCCACTTr^^ JOO 
CTCCAAATCCTAAAAAA H 1 lU 1 1 ACGACTCATAT ACCAATCACTTAAAOCgTCAAAACATAAACCAOCATC»CAGAAATAAAATTCCTCACCT 

«S CLRPrETHAOYHVSOrASPVPOASLPILSSftDL 100 



101 CCTCATTCCCTTCCTCA L i 1 i A TTA U i L : U CTAGCTACTCCA U 1 i 1 HA CCCTTTACCCTCAACCCCAGAAACAATCT LU iC i ; 1 CTATCACAATTATG 400 
CCACTAACCGAACCACTCAAATAATCACAAACATCCATa^CirrCAAAAC7X;CCUAATqCC^^ 1 L i 1 1 1 1 i AGACCACAAACATA C i t ^L l AATAC 

101 VXCLLTLLVPtASAVLTLyRQAQKESRVSHTXH 



401 AAXTOAAAATACCATGATTCAACTAAACAATATATCTAAAAAATTTOCAACCCgrC^ 1 1 I CACATACGAAtC l ' l l A 4B1 

1 i ILLl i i l ATCCTACTAACTTCA l IIL! I ATA7ACA I U w * AAACCrTCCCCACTCGATAAAACTCTATGCTTACAAAT 



114 



K C X * 



117 
Fig. 20



e^vn Tn no* >SQ^ l fr^i-rr-rt^ir^fTTXTKTrrKi i ia i ^ CTCTeeKmU^TATTCAATCCXTCaUUlTCTXTTACAATCGArCl lALii U lACTTCAXCATATCACCACTCC 100 

(SEQ ID N0.>3^^; X iniJJSJSiSJSTXTlIiTlISS^ 

(SEQ ID NO: 60) ^ ^ ^ ^ ^ 

(SEQ ID NO: 58) * 



S VYCrPrTYILrrFYLMIIKYFMJtLECRI»l.KSlIt ,7 



201 



jeHFTSFSfRLAALSTCIKTATLrLLIFLIAFSHOF 71 

10- TTAgCTTCT lL. 1 i GG AGATAAACC ACCTTCA 1 i li 1 1 AACACAATTTT A-KK:TAT AAgTATTCCAAACAATCCTAU 1 1 ILl I lATACCAi H U i HC7C *00 
IliSSuSASAASrSAT^^ 

72 SFSLEIKEVOrLRErYCtSIAHMASFFICFFFS 104 

401 TTATATACCATACTA l .. • ..ill I ATCCTTACrrACTATTACCA Ul 1 U iLi iU.i 1 l AAAAAA TCAAACATCACCTTACTM ULIOU lACTTTTTTA SOO 
AATATATCCTATCATAAACAAAAATACGAATOUTGATAATCCTCAAAAAGAACCAAATT^ 

105 YlAYYFFLSLtTISSPSWFKKSHMSLVFLFTFL IJl 

so: TTTCTACAATCCTTATT(TOyirrTATCAOTTCCACAATCCC^ 

AAACATCrrACCAATAACACCTAAATACTCAACCTCTTACCCTATTAACCTAATAACta^ 

IJBFVESLFWIYOLOMGIIGLLPIFOlfMVIISWPyALI 171 

MI TTTATTGCCTTACATTACrATCTATCATAATTCCATTCACTCTArrTTCT(^ '00 
AAATAACCCAATOTAATCATACATAGTATTAAMTAACTWCATAAAACACAACTATCTTTCACCrCCTCTCACA 

na YWLTLLSlXXPLTVrSVHRHHRRV* 19C 
gep47 Fig. 21 (Sheet 1 of 2)

(SEQ ID NO: 62) 1 ACGGJUCXAGXAAATTTCAC I. l W . LU l U lT*TAfcTJUSJUUrTCTCTATATAJ>CCACCTAJaTCATCa:XCTTACT^ ,(,0
(SEQ ID NO: 63) T LLU 1 1 U 1 1 L l U l AAACTCCAAAACCACTATATTATCTTCACACATATATTCCTCCATTTACTACCTCAATCACCTACCT^ U U i U I AAAATAOC

(SEQ ID NO: 61) ^ « e l v h c i s t h r i g u 

101 AATCAXAAAACrrrAAAAOUUlCAAAATTACCCTCCtnm ^00 
TTACTTTTTTCAAATrrrCTmnTTTAATGCCACGCAJUUT^^ 

M SXXFKTNKITVRFTAP LSLOTIACHHLSASHLZ 4fi 

201 CArTcrT-xATrjLTJtTfrrArrr^Af^rTr-TrAAr.AT-rTni r^ i r .a r^ CTTiMgexCTrTATJureCTxgjLcxTATCTt^cr^ 
CTCACCATTACTCTACATCCCCT&AACACrrCTAAACT C Cr C I C T G AACCOCTCACATATCCCATCTCT 

4-7 TANOHYPTSOOLRRHLASLYGTDMSTHCrXRCQ 

3 0 1 ACCCACATTATACAATTCACATrrACCTATCTTCCTCATCACTTTTrAACTAC^^ 4 „ . 
TCCGTCrAATATCTTAACTCTAAArCGATACAACCACTACTCAAAAATTCAT LL U 1 i iLr CACCATTOGACACTCTXAAACCrraAACA i 1 1 1 Li 1 lU AC 

BO SHI lELTFTYVBOZFLSRKNVLTSOlLELVKETl 113 

401 TTTrrrCACCCCCACTACTTCATAATCGCTTTSATCCCCCCrrATTTCAAATTCACAAAAAAa^ 5*0 
AAAAAAgrCtKCCTCATCAACTATTACCCAAACTAQCCCCCAATAAACTTTAA t ILi i U 1 iUi l AACCATCCTTCAAATCCTCCACTATACCTACTAAC 

U4 FSPAVVDNCrOPALFCXCKKOLLASLAAOMDOS I4( 

501 TrrTTATTTTCCACATAAAOAATTCCy^TAAATTU . . . U 1 CATCATCAALU 1 L i 1 uaTTGGAATATACTCATTTACCAJUTCCTATTTTACCTtJAAACT COO 
AAAAATAAAAC(rrGTATTTCTTAACCTATTTAACAJUUkAA<rrACTACTTCCAOAA(?rrAACCW 

14T FyrAXKCLDKLFFHDCRLOLSySDLRHRILAET 179 

« 0 1 CCACAAACTTCTTArrCT7trT7CCAACAA777TTACCCAATCATCCAATACA t > a 1 1 1 : 1 LL I ACCTCArrrrAATGACCTTCAAATTCAAAATCTAT 70 0 
CCTCrrrau^CAATAACAACAAACCTTCTTAAAAATCtWTTACTACCrrATCTAA AC AAAAACCATCCACT 

ISOPQSSYSCFOEFLAHORXOFFFLCOFNEVEXQHVL 313 

1 0 : TACAATCATTTCCCTTTAAACCTCCAAAAGCACATCrrSAACCTTCACTATTCTCAACm 100 
ATCTTACTAAACCCAAATTTCCACCTTTTCCTCTACACrrCCAACTCATAACACTTCGAATAACATTATAC<^^ 

214 ESFCFKORKCDVKVOYCCPYSNILOECMVRKMV 24C 

eo: CGCACAATCCATTTTOSAATrACCTTATCATTACCCrrCrAAATATCSTGATCACCAAM 900 
CCCTCTTACCTAAAACCTTAATCCAATAGTAATCCCAAOlTrrATACCACTACTCCTTCTAAATCCCTACT 

247 C0S2LeLGYHyRSKYGDE0HLPniVliHGl.LG0P J79 

JC: CCTCACTCrAACCTCriTACAAATCTeCCTCAAAATGCTCGATTACCrrATACCATTTCJUVCT 1000 
CCAGTCACATTCCACAAATQTrrACAGCCACTTTTACGACCTAATCCAATATCCTAAACTrCACTCCAACTAAAT^ 

5t0AHSItLFTMVSCNACLAYTlSSEl,0tFS6FLRMYA JIJ 

1 0 0 ; CTCCTATCAATCCACAAAATCCTAACCACCCTCCTAAAATCATCAAT AATCAACTCCTO 1 10 0 
CACCATACTTACCTCTTTTACCATTCCTCCGAGa^TTTTACTACrrArrACTTSACCAACTAAAT^^ 

11* CIKUCNRMOARKMMKBOLLOLKKCYPTEFELMO )4( 

i;o: CACCAACOAAATCATTCCrrCCTCCTrtrrTACnTTrTCAAGATAATakAT^ ijoo 
CTCCTTCCrrTACTAAGCAACCACCAACAATCAAACAGrrCTATTACTTAGAAGTAACTAACTTCCAC^ 

"''£"»«»'St.-LSODN0SSLIERAYOMALFGKSS J79 

^li2!:^^tt!^!^En^T^^ "00 

CCTCTCAAATTTTCAACCTAACGTTTCCAACrrCTTr AACTCrrrCTACGATAAACATCTCATCG^ 

3IOACrKSWIAICLEO:DltOA:CRVAMWVKLOAl YFME 413 
Fig. 21 (Sheet 2 of 2)

1301 AACCAATACAATGACAXfc U* 1 ] U * . . i iU AACAJUUaTACTATCCACCTCTJUUUUaAAAACCTrTATCCAA C i 1 1 1 l^ CCXACCCATTCACACTTCCT I « 00 
TrCCTTATCTTACrCnTCCAAaUUUU Li i L* . U * ATGfcTACCTOgACA l 1 U Li i 1 1 CCAAATACCTTCAeCAAA LLU* I lUCCI JUCTCTCAACC* 



414 CIS 



41T 
gep61 Fig. 22

(SEQ ID NO: 65) ^ Oi 1 1 i I iU A CCA l I I CfcAAA ClCUl l ACCfcOUUJUUaCJUUnrCTCrATACTTCCJUU^ llLA aUTCTCX CI UUJr A I UA. . ! loo

Wfcy iw. ^/ CAX*AAACTCCTAAA uUii CACCAA iLbiUlLi*UlCnU ^CCACATATGAAC CiI.Ltii AA*TAATCXWU U^ 

(SEQ ID NO: 66) 

10 1 TAOAGAAAXATTAAL. U H CCC HTCCTTTATOCAGAG U i I LL lU H 1 ATGCGAATCAACATTTACTACTOCAATCTTOCAAAritaCTCC C JUUU^CAAOT 300 
ATCT L i i 1 i i AATTCAACACCCTACCJUUkTACCTeTCCA*Ca^C**ATACCCTTACTTCTAAATeATCACOT i i l AACTCACC ^* i U iUi I CA 

(SEQ ID NO: 64) x HVYCrVPVTXWEDLVVlSOItLTPKTS 3( 

301 TTTCAAATAACCCACTGCaiCTTAAATAAACAACCAATTCCAgrATrrAACCTATCA^ j 00 
AAACTTTATTCSCTCACCSCaAATTTA U 1 L 1 1 LL i i AACCTCATAAATTCCATACTTTACTACTTAAATATCSACC CL i U ii lUL ' l AAAAATATACTAC 

SiPOXTCHRLHXOGtPVPXLSHROFIAXDKKrLYDQ to 

101 AATCAC«XrrAACTCCAACAATAAAAAAACTATCCrrACJaTCTCACrTTAAACT6TACAATACTCCTTATCATTTAJ^ 4 q 0 
TTACTCTCCATTGAC L. llL>U A iUUli CATACCAATCTTACACTGAAATTTGACATCTTATCACCAATACTAAA U U L U CACrTTASTACGAATAg 

SI SCVTPTXXKVWLSSDFXLYHSPYDLKtVKSSLS »3 

4 0 1 ACCTTATTCGCAACTATCAATCGACAACACCA 1 C 1 1 1 01 ACSAACGAACACAATTTCrACATATTqATCACCCTGGATCCCTACCTAAACAATCAACTTCT 900 
TCCAATAACCCTTCATACTTAG L m 11 C I U TT ACAAACA l L U LL U CTCrrAAACATCTATAACrACTCCGACCTACCCATCGA U i LI t ACTTCAACA 

94 AYSQVSXDXTHrVtCXEFLHXOOAGWVAXCSTS 13( 

SO I CAACAASATAATCCSATCACTAAACrTCAACAAATCTTATCTCAAAAATATCASAAA g A l 1 L J 1 1 L I L I ATrrATgTTAACCAACTCACTACTOCAAAAC iOO 
L M L i i CTArrACCCTACrCATTTCAA L U L i U ACAATAGA L i U i i ATACT L 1 i i L 1 AACAAAGAGATAAATACAA i 1 L U I I GACTGATQA LL M U L 

137EEONRHSKV0CNLSEXY0XDSF5I YVKOLTTGKE 160 

6 0 1 AACCTCCTATCAATCAACATGAAAACATCTATCCACCCACCCTTrTCAAA L I L , L U ATCTCTATTATACCCAACAAAAAATAAATCA uIl. i C i 1 1 ATCX 700 

TTCCACCATACTTASTTCTA L » i . 1 CTACATACCTCCCTCCCAAAACrrrGACASAATACACATAATATC LUl 1 L U 1 U 1 ATTTACTCCCAGAAATACT 

ISl ACXKOOEKNYAASVLKLSYLYYTOCKINEGLYO 193 

7 0 1 CrrACATACCACTaTAAAATACCTATCTSCACTCAATCX-iTrrCCACSTTCTTATAAACOkCAGCGA 100 

CAATCTATCCTGACArrrTATCCATACACCTCACTTACrAAAACCTCauWAATATTTCCTCTCCCTTCACCATCAGA^^ 

194 LCTTVKYVSAVNDrpCSYKPECSCSLPXXEOMK 33S 

8 0 1 CAATArrCrirAAACSATrr AATTACCAAACTATCAAAACAATCTWTAATGT ACCrCATAATCTATTSC^ 900 

CTTATAAOAAATTTCrrAAATTAATCrrrTCATA b. . . . ^. l AGACTATTACATCCACTATTAGATAACCCTATAATCTAAA H 1 lULI l AGACTACCCT 

337CYSLKOLITRVSKESDHVAHKLLGYYXSHOSDAT 3ftO 

»0: CATTCAAATCCAAGATCTrrSCCATTATSCSACATCArrCCSATCCAAAAGAAAAATTCATTTCTTCTAACATCCCCSCCAAC XOOO 
CTAACTT7ACC7TC7ACAGACCCTAATACrrTC7ACTAACCCTAC L . i i ILU U l AACrAAACAACATTCTACCCCCCCrrCAAATACCTTCSATAAAT 

36; FKSKKSAIKCOOMDPKEKLISSKHACRFMSAIY 391 

1 0 C : T AAT CAAAATCCATTTCTCrTACACTCTTTCACTAAAACACArrrnWTACTCAGCC AArrCCOUUWKT^^ H 0 0 
ATTACTTTTACCTAAACACCAJCTCAGAAACTCArrTTSTCTAAAACTATCACTCCCrrAACCCrrTCCACAAAOACAATT^ 

2»< KCKCrVLCSLTXrCrOSORIAKCVSVXVAHXXC 13* 

no: CATCCCCATCAArrrAACCATCATACCCSTCrrCTCTATCCIlCATTCrCaiTTTATTCTfrCTATTrTCACTAAaAAT^^ 1200 
CTACCCCTACTTAAATTCGTACTATCCCCACAACAGATACCTCTAAOAOGTAAATAACAAACATAAAACTCATTCTTAAOAC^ 

J?"CACCFKHSTCVVYADSPriLStFTXNSOYDTXSX ISO 

: 2 : ; ASATAGCCAACSATSTTT ATSACCTT rTAAAATCACeCAACCACATTTTTTAAATCATrrrrrCAAC AACCCATATTT CAAAAAC CfcTCCTAACCCCCTT 1100 
TrrA7CSCrr:rrA:AAATArrCCAACAT7T7ArrCCC77 GC TCTAAAAAATTTACTAAAACA U. ILl ILL - LI ATAAA U. 1 i i ILUI ACCATTCCCCCAA 

J<: : A r c V Y c V L r • j7i 
(SEQ ID NO: 68) 1 TTGA*»»AT)iTTATCT*TMta*» Ct 3ACAT*TAAATCTAAdu^*CWCgrW I , . . 1 J iU; 1 ATACTACTA lil* K 1 I 1 AAAABAACCA IOC

(SEQ ID NO: 69) AACrmTATAATACATA i 1 L 1 1 U. I b i ATAT7TACA I L L 1 U UCU CATTATAAATAATCCgCAAAAAA* C CATATCATCAT AACAGAAA 1 i i i 1 LC I 

101 grATCTACCTAATATCJUUMUVAAAAATCTTACCCTCACTTrrATTAACTACASTAAIU^i UL ILAACTACC IV* 1 1 1 lAACAACTCCCCATCCAQAAACC 200 
CATASATCCATTATA Ll ILi U i 1 i l ACAATCCCACTtllUUUTAATTCATCTCATTACCAAAGACTTCAT ro ACAAAATT^^ i lu : 

(SEQ ID NO: 67 ) * mjckkilaslllstvhvsovavlttamact 39 

201 ACTCATCACAAAATTCCn:crCAAGATAATAAAATTA(rrAACTTAAatf;CACAAeAACAACAACCCCAAAAAC^ j 0 0 

TCACrA LIUiUi AACGACCACTTCTATTATTrrAATCATTGAA iiUlLUiUilUiiUiiUiiLa«UliUik*ll C ^ 

lOTDOKZAAQDKKI SMLTAOOOKAOKOVDOIOEOVS (} 

J 01 CACCTATTCWUSCTGAGCACTCTAACrrCCAACCTCAAAATGATAG^ iOO 
CTCGATAACrrCCACTCCTCACATTCAACCTTCCA L 11 U ACTATCTAA lOi i LUi LI i AGATT Ll 1 1 U ACCTCCCACTCTAATCTCTTGAAAGA l U i 1 

64 AXQAEOSNLOAENORLOACSKKLECEXTCLSKH 

401 CA U U U 1 w 1 Lk* i AACCAA 10*11 U WUUUCAACCTCCTACTCCTCAAAeAAATCCACCCCTAACrACCTATATC^ SOO 
CTAACAAAGACCA l 1U*1 l ACOmC C: . i i 1 L I i CGACCATCACGA O 1 i i I* 1 M ACCTCCCCATTOATCGATATACTTATGCTAACATTTC M^i 1 1 lA CT 

9T IVSSMQSLEKOARSAOTMCAVTSYIHTXVHSKS 139 

501 ATTACACAASCTATrTCA CU 1 0 ! 1 O CTCeAATGACTCAAATCOTATCTCCAAA C AA CAAAA TCTTACAACaA CAAAAC CCAGATJ^^ (00 
TAATCT H ILO ATAAACTCCACAACCACCTTACTCACTTTACCATAaA LUHiUilUilll ACAA lCiiUUUlU I C CUlLi A l 1 i i i i CQATAAAQAC 

DOITEAXSRVAAHSCIVSANHKHLEOOXAOKXAI SC lit 

SO I AAAAACAACTJU:CAAATAATCATGCTATCAATACTCTAArrCCTAATCAACAAAAATn»CTCATCATC TOO 
> . . . 4 ^ . . CAT LLi 1 i 1 ATTACTACGATACT7A7CACATTAACCATTA 0 . i u: 1 * 1 1 AACCCACTACTACCAC I T C a iAACTCATG L t U U 1 C CU I L I lU t 

1(4 KOVAHHOAIKTVIANQOKLAOOAOALTTKOAEL 19< 

t S : AAAACCTCCTCAATTAACTCTTC CTCCTCACAAACCG ACTACCTCAACGCGAAAAACCAACC CTATTACACCIUkCAAGCAGCACCTG ACCCACACCCTCC 100 
rrrTCCACCACTTAATTCACAACCACCAC7C7TTCCCTCATCCACT7CCC L « « CCCATAATCT m U U : 1 CCTCCTCCACTCCU Z CT C CUACC 

19'>XAACLSLAAEKATS* 211 
Fig. 24

TfWES_8ACSU 



(SEQ ID NO: 71) l ATCTTAATTCCTTTATTt^TrATTTTCCciTACTTO 

(SEQ ID NO: 72) ^**=**"**CGXAATAACTAATAAMCCCCATCAACTATCCCTCCT

(SEQ ID NO: 70) I „ u I A . t t I L X r L I 0 s r P s c L I V c K . A K c X 0 . . s „ 



IQO 



14 



TCCCriLU:a;nt;AAT(XCCCATCgrTACCTAA0(X3lTCTAACCCA^ 

°»0Mt-G*TKArRTLOVKACSVVIA0DILItGTLA 

;*g^^^g^TTTTCrCATCCATCTTCAT^^ 1 n ASCCCACCTCTTrCCaitCTTC^^ jfl. 

TT(»CCTAACCSCAAAAGACTACCTACAACTATAA0T(XC<3WU«yUUCTCCT " 

«8 TALPFLMHVDIHPLLAGVFAVLCHVFPIFAICPK 

301 a:CCCTAAACCCCTCGCGACATCACCAa«KTTTTCCrATTTTAC^ 

CCCCCATTTCgCCACCCCTCTACTCCIC a ;rAAAArnATAAAATtXXm»awau^TAAATA 

lOlOCKAVATSOOVLLPYAPtLFXTMVAVfriFLYLT 134 

4 0 1 CTAAAl i iU U 1 CI UlLTCATCCATCTTAACACCCATCTATACTCTTATATATA ti U 1 L I lit. I C CATCATACCTATTTATTC A TT G I CUr rACCCTCCT S 0 0 
OATTTAAACAAACACACACTAOCTACAA 1 10 1 CCC T A CATATCACAATATAT ATCAAACAAAOUiCTACT AT0CATAAATAACTAACACOUlTX»a«^ 

K'VSLSSMLTOXTTVIYSPFVHOTYLLIVVTLL 167 

501 CACTAl i i I i^*iUATATACA CACAC CCACOSAACATTAAAOCaUlTTATCAATAAAACACAACCTAAACTAAAATCCTTA $83 
CTtSATAAAAACACTATAiUiHUlUjCTCOCTrorA Al 1 iU.1 lA ATACTTA lii lUiLl lUJll llUii U i ACCAATATT 

1" TIFVIYRHRAIIIKRIIMKYEP)CVXI»L« X« 



WO 99/33871

29 / 30

PCT/US98/27918






Sequence Listing 
gep103



(SEQ ID NO: 2) ^ TOTCATTTTT Cc*a»AA crrrTArTACAC»TMAAC«rrcT»^ ... 
(SEQ ID NO: 3)

TATTTAACTCCXTTATTCCTACTCTAATCTAm 
(SEQ ID NO: 1) 1 HRLDKYLICVSRllJCftRTVAHEVADKCR 37

TACTTCCAATTACCTTACAACCarnTTCAACTTCCCnWACTTTCAATT ' ° 



28 I XVNCI I-AXSSTDLKVMDOVCIRrCNKL 



L L V It V L 6: 



^ ° ^ T*<^C*TCAAACATACT ACAA*^>X> CAACATCCACCACGAATCTATOAAATTATCASTGAAAC^ . 0 

ATCrCTAw * 4 t ^ iATCA IV, . 1 . i : i H.'U CTACCTCgTCCTTACATACTTTAATACTCACTrTCTCCCCAT L 1 1 L I li 1 ACACATTTTTATAACATCTTA 

*2 ^^tDSTKXEDAACMYElISETRVEEHV* 
vjtv 4.U 4 w J. rrrr*cccAAACCTT*cACTCAtATCcc7xrrTSCGXAATACTTSCs^^ I ; xcrTTrr L^. . ^1

(SEQ ID NO: 6) 



(SEQ ID NO: 4)



; 5 ; CG C CACW7CACTraTCAA77rr5CC»XACTAC7T»aUlCT77^TGAAA*CGAC^ 2 3 Q 

C' -LV. . L . . . XSTCAXCXCTTAAGACCCTrrCATaUkTCTTCXJUlCrA L,.L CACrrcrrrCUCTA*CCfcCTC7TTAC*ATTArgTCCAAXJkTAC7 

MKRT«RNSrVTHt»IT?PFlI 11 



: Z : TT3CCAATATTCAGATTCZCAATCSTA l.C^. < « l ACCCCCTATCC ^lOJ^^Hl ACCAACTCAS LL* * 4CUf *CeA TC 0CA*AAfl> CCTCOGJU;CTCCAC7 J 00 
AACCCTTATAACTCTAAOCCTTACCATCCCAAAATCCCCCATACCCACesCACTSCrTa^aTCCC^ L ULi CCACCCTCCACCTCA 

30 CHICIPNKTVLAPMACVTNSArXTlXXCLCACl, 53 

131 CCTTCTAATCCAAATCCTTTTrrSACAACCCAATCCAATACAACAAC S AAXAAACCCTCCATATC^ AACCCrCTCTCTATC »00 

CCAACArrA:rrrTACCAaACACTCT7C;rr7ACCTTAT >«* .w. i CCaACCTATACCAACTATACCTA C;uULL*L. ; > «U aGACACACATAC 

53 VVMtMVSDKCIOYNHCXTtHMLHIOtCEHPVS: 65 



C7TCAXAAAC aCTCCCT IrTTTT JTCGCAT^TCCSCCT^T^ ut^U^ CTGCCTATACCACCTATA U 11 C i ACCCCACCGgIc 

a&c . rcscc3SLASAAEr : schtktd i vd i hmgcpv ii> 



S 0 i 7=AXCWLAATCSTCAArUACCAAJC?CCACCTA7CTCCCTeAACCATtrrSACAAC^ A I. i LLl ILA TATCCC COG 

A Or r O: l . * AC=A C; ; w\r crrr;SAtrrC5ATACACCCACTTCCrA3ffllC7C7TCrACATCAaATACTA Ui lUI i CCAOCTCACA C AC^ 

130 NKI -.-KUEACAMWLKOPQKITSIINKVOSVLDIP 1S2 



1 0 i ACTTACTGTCAAAATCCCTACCC5rTGCGCCCACCCATCTCTCGCA<nACAAAATCCttTC3C?CCTaACCCTOCAO^^ 
TCAATCACACrrrTACCCATGCCCCACeCCCCTCCCTACACACCCTCATrrrTTACOCCACCCACCACTC^^ 



V K H R 



UADPSLAVCNALAAEAACVSALAMH 



700 
1«S 



1 C : CCSCCTACCCCTCAACAAATCTATACTCCCCACCCACA LL; :U ACA L . ACAACCTTCCCCAACCTCTAACCAACATTCeATTCATCCCCAACOgrO 100 

CeCCCATCSCCACrTCTTTACATATCAC LU * , U LU i ^ t Ui AACTCTSGSAAATCTTCCAACGCCTTCCACA i i H L * AACCTAACTACCCCTTCCCAC 

l»4GPTPE0HrTCHA0LtT-yKVA0Al.TKIPriAMCD Jil 

101 ATATCCCTArrcrCCAACAACCCAACCAACCCATCCAACAACTTCCTCCTSACCC^ »00 
TATACCSATCACAC G i i t. ! 1 CS J V . . . C CC7AC L . ; L . . SAACCACGACTCCgTCACT ACTAACCCCCTCCACCCTA LH. 1 1 1 ACCAATCCACAA CT T 

230 :RTVOCAICCIIlCCVCADAVHICRAANCIfPYI.PN 3S3 

101 CaUU^TCAACCATTACTTTT : fcfc* C »ClS*CIUU^TCCTACCTCA7TTO^CrTTTCAACAC^ 1000 
CCTTTAgTTCCTAATCAAA H i J^,^,.,. I t ACGATCCACTAAACTCCAAA H » ^ i 1 iH ACTTCTAC OCGA I UC i iU lU AA L 1 1 1 U CTAACTAATTC 

3S1 GlNHTrCTCflLPDLTriDKMKXATIHLllllLXN 3tS 

1 0 0 : CT fa A> ra fc C fcAAACC TCCCACTTCCTCAA l ICCIXmLt 1 LI.L 1 LL 1 L ACTA 1 L 1 LLUI UC AACATCTCCCSCTCCeAJaCTCCCTOQfUICXAT^^ 1 1 0 0 
CA P i i i ^, i L i r TTCCACCCreAfcCCACTT AACa cnC CeO AC C OAfiOaCTCATACACCCACCTTCTACA^ 1 i m ACCCXCCTOBOTAAACCC 

SlkLKOtllVAVRIPaCLAPIITtiaTSGAAKLKOAI SO 11* 



TTeCATCCTCQCATCCTCTCTAACTT C CCOACAACCTTAA LL . ^ U LL CAATTATCAAA 1 1 i iCCiU CATTCACACAATTTCtCKSAC RACTTACCOOCCT 
laOASTLACIXALLOLCXA* 1J« 



(SEQ ID NO: 9) 

: ; : taotacx : : 1 1 u *aa . u accta ^ ^ xyrcAecACATJLACCX CLL . , m : , . l ^ i u um crrc x: i u. i xTrsxTCXTACc»TJucccTA joo 

ATCATCTAAAACrTT ACCGJUAAACTCSATCJUtACACTO U. 1 LL « U I AT7CCT C CC*ACACACC»CT770kACTAACCXTAACT»CTXTCCT*TTCgOlT 

: : : CTCAccATCATrAXTCCAcrrAT ... aacattaccaataacttcacaaacca i I* * t : : i xtcaatatcttta LA SATArrcTCTai L : i >oo 

CACTOrrACTAATTACCTCAAT ACAACAAATT CT AATCCTTATTCAACrCrrr^ 

K: TTrT<>CTC CC:L>. .1 AAACCATAAC7CC7ACACCCCCACATTC77ACCATAACAAAATTCACCAA^ 400 
AAAACTCACCCACCAAATT-rr— AT-r^.-^A-j r-^-rrc^M.i^^^Tr^-rrt.-rTr-r^T-ri.^r-rrrrm 

*Zl TCAeCTTATCTCTrCATAACATAAAACCAACAATTJn'ATCrTCGCTGATATACCATrrCTCCCCATTATC^ SOO 
ACTCCAATACACAGCTATTCTATTTTSCTTCTTAACArACAACCCACTXTATCCTAAACAO^^ 

S:: TTCAACT': ; lUrUATTTTCA7ASrTrrATTA7AACTCAAAATCTCJ^7AAaiTACC«rrATCAATCTCI^^ 400 
AACT7CAAAACACTAAAACTA7r3ACATAA7A7TSACrrrrACAC7ATTCTATCCCCA7ACTTAGACTTTCAL W ATGCTAATTTTTACTTC 

(SEQ ID NO: 7) : h h l « v k o it i p l jc i k t« 

i : : C3CATCCCAATTAACCCTCACCCAA7CGCrTTTTXC=AAAAAACXTTXrrCTTTCTACCX^^ 700 
CCCTACCCTTAATTCCtA;TC::rr7AG;;3AAAA7S^. . . : ; ICIAATCAaAAACATCCTCCTCCACACTTTCSSCrrCTATACATAACACTCTAAtCJU 

-^»**C:nCEC ;CrYCK7LVrVPCALIlC£DI rCOITJ 4I 

- : : rrATTACAC CCAA ,. . .-^.l ll>AACCAAAAr?ArrCAACCTCAACAAgAACTCTAAATTTCSAATTOTCC»TC^^ • 0 0 

CATAATCrCCS! ICAAACAA C . * ,^ . . . . AATCACTT "CA5 1 lO: . CACATTTAAAC CTTAACACOCTACAACATCATAAATATTACTTACCCCTCC 

<» ^■■»^VEAK;.«ltVMItKSKr«XVP5CT:YIICCCC 11 

9:: rrCCCAAATCATCCArrrSCATTATCATAACCLACCTCSACTTCAACAreCXCTTArrTCATCAACre 900 
CACCCrr?ACTACC7GCAC<rrAA7AC7A77Ca7CSACCrCAACTTC7S;CTCAATGAACTXCrrrCCCCALl * . . . 1 AA*CCACCACCTCCTA7AeTTTTA 



s: c 



I"HUHyCKC:.tritTOLl.HOALKItFAPXCYtlf 



III 



»3: 7AT CAAA TTCCT CCAA CTATTCCAATCCACSAACCAAAATA7TXCACACCTAACTTACAATTTeACACTCCAAAAT^ 1000 
A?ACTTTAACCAC C 7TCATAACC77ACC7r ,. . > . A7AATCTCrCCATTCAATCTTAAACTCTgACCriTTAAATTT77ACTCa u; i t LU#^ ^ 

::4ttipr7 icwotPitrTiAitLorOTiiitriNOVKACi, 141 

100: TATAT gCfcCAAA»C TCTCfcCTAlTTACTAeASrrCAAACA HU.L lUL ! ACAACATAAfMAAACeeAACTCATKgTAATeCCTTACa^CAATTAm 1100 
* • *TACU«H 1 1 lUACACTCATAAATCATCTCAACTTTCrCACCCACCATCTTCTA i XLLU iU »TTCACTAACCATTACCCAATCgTClT*ATOiUTC 

U» VAOOSMTLVKLKOCLVQDXCTOVlAMftLABLLT 111 

liC; T7»TCAC CW» TT CCAA TCXCCCATCACAC AAAACTTCTA g; 1 C 1 C LU I ACTATTATCCTCCCACCCCCCACJ^^^aA CCm CACCT^eACA^TXTTA^ UOO 
AATA>,IUilCTA*BCTTACTCCCTA CIC.L T 1 1 1 U VACATCCACAC C CATttATAATACCAOCCTCCC LLL 1 LI 1 iHUU -C m T CCA ACTCTAATAATAA 

Itl TMOiriTOEKRVLCViTinVIIIARXTOOVOIIt 3U 

1301 °^ *^**^^^ 'r*CpT**TTTXACTCAArrCCTAAAACA C 11 0. M AAAfiATTTeeeACAAOrrqTCAeACTAgCTgfrAATACAAATAexgeT^ 1100 
*AATCTTTOCCOCTCCAATTAAATTa*OTTAACCArrTTCICAACCAArrrCTAAACe<^^ 

SUVTVIOLlfLTOLVKCLVKOrriVVTVAVIITirTAKT lit 
CCTCACTCTArATACCAUl > > iHUlLH.-.AATACAl.LLHXUtiLlLfcTAALi IH ILLACATCACTTAATACTTAAAASTCATAOOOGACCTCBAAA 
»» V e T , G f X r A , , V , T t . 8 « B I , , , * , e „ » , , , , , ,

"= ' ' " ^ = ^ -" » ' ■- ^ » - = ^ V P . K „ V V : s C H V S t L A .1. 

*" " ° " • ' " ' = ^ " " ■ 0 ' V 0 H , P H T A . I . A V V « L I „. 

A. .CTTTTCAAATTTTTT^.^rTC. . .:y*ACTrr:rTOACATATTAT(»TTe7CAACrTTTATTOtTaAeTCe*)IOCA^^ 
««» 7 « V • 

4S> 

.-cs>,AAwreTOC^TTCTO»C^AACCTTA^^TGrCT<»TACCATACAAeOeeAACeTTC^^ 
(SEQ ID NO: 11) • ^^^^^ : xtttatcttaccaaatttccctcxaattacctactaccataSw^ xcToecmAAAexcccTATrrcMATTCAC xac

(SEQ ID NO: 12) 



i:: rrraiSXCCATCTJlCCATGGXJUUUl T L w . l ATJUlTAATtaAAAACGJkCMCCCeXTCCXCX*rt»TTTTATTAATXCW;*TGATC^ 1 I U kA 203 

AAACTCTSSTXeUTCCrX L XCACAATATTXTTA L. . U * i CCeeT M.X. . U 1 . CTAAAATAATTXr L T L 1 ACTACTCCACTAACCACrr 



(SEQ ID NO: 10) *



MBXILLtCOOOVItc 



2 C 1 CXCATT nr/; * HAA rCC r C l C T CJ UlTCCCCATTTMXACTCCregTCCTACAACACTTTATeCAA U : . 1 1 C ACTCTA 1 1 1 U i i CACTCCCAA C C T O ^TCTCC 300 
CTCTAACCTTTTr ACSACACACTTACCC CTAAAinTCACCACCACCATCTTCTCAAATACCrr CAAAACTCACATAAACAACTCACCCTTOUCT 

KOISKMLSCWCrJlVVLVtOFMCVLSLrVOSCrHtV 4> 

)C1 TCrrCATCCATA lil^iML^^w. W AATCCrrATCX Cia::CI C*CCAAATCCCCAACA7T^ LMLUA CASA 400 

XGGACTACCTXTAACCAAACOCCAACAAATTACCAATACTGACCACA UiLL: i 1 AOCCSTTCTAAACCTTCCATCCATACTACXAAaUUaAAOCTCTCr 

SO LHOtCLPLFHCYHWCOCXRKISXVPIttrLSSltD O 

* 0 1 CCACCCTATSCATATTCTCATCCCAATCAATATCCCCa roSATCA L W .UiU kCXAAC LUi 1 i iU ^CCACCAC t^i 1 L. U l ASCTAAanTCA U^i 500 
GCTCCCATACCTATAACACTACCCrrACTTATACCCCCCCCTACTCAAACA Li LpC i i CGGAAAA C TXy^7CgTCCAACAAAATCXI^TTCau^ Cl LLLU AAC 

»> 0»K01VM*:HHCAOOrVTlCPrD00VI.LAKV0CL ;i5 

S 0 1 . . wLU > , . L H ATCX L 1 i i U CCCCTCATCACA U 1 1 1 U CTGCAATATC L I U« i U i I ATCCTCAATACCAAATCCATCCAnTACATTAT r H »00<XAAa (00 
AACCCACCAACCATACTCAAACCCCCACTACTCTCAAACCACCrrATACSACCACAATAOGACTTATCCTTTAOCT^ IL 

IKLRRSYCrCRDCSLLCYACVlLNTKSHOLHrOQOV I«» 

COl TCrTCAArrrSACCAAaAATaAATTCCACATTT7ACC t.m*. , A7TTCAGCATCak(»CAACXTCCTACCACgrOACCACeTSATGCCCaAA Li i lUU AA 700 
AGAACTTAAA C I I . C . 1 XCTTAACCTC7AAXATCCCCACAATAAACTCCTA CU t LU U U U f ACCATCCTCeACTecrCCACTACCCCCrrCAAACCTT 

150 LKLTIINErC::.RVLrCHAGKtVAJtDOLHR£LUN 1»3 

to: CACTCA L t- XTTCATCATAATACSrrS T J;u; CAATCTCC .>!(.Ul . iU CCTAAAAA Ul ICC ACCACCACGOATTCCTACCATTTATCTXrSU C CXA G 100 

S7CACTCAAAAACTAACTACTATTATCCCACACACACTTACACCCACO>AA C CCA i . i H CAA CVIC L I CV. 1 C L L'lAXCCATCCTAAATAC ClCmai I C 

IB) S0rriOONT;,SVNVARt.llKKtCC0CLVGriITK 215 



• 0 1 AAACCAATACCgTACCCATTCAACCATCCrTSXTTSSAAACXA' . rrACCC7ATCTCC5CT H,LU 1 ACT LL i H 11 1 1 ATCTAn 

. * ULl . ATCCCATCCCTAA t.: «LUi ACCAACTAA C^. . .U. , AAAAAAGATCCCATACACOCCACGCCATCACCAGAAAAATACAT X CU CG AAAflAAAC 

31«RC1CYCLRHA* 



90\ CCX U.L..UH.li A HL,U CA U. . . . 1 ATTTCgCACTCTACaAATTrA L l ( CiT rCT AU . i . 1 w. ILUUIUUUL U ILl AACCA lLllAl U U LA 1000 
CC?AAAGAACACAATCACAAACTCAAAAATAAACCCTCAGATCCTTAAATCAACCA O ATCAAAAAaAAGAACX r AA a ^ 
(SEQ ID NO: 14) TJuua:^CJlCTt:cJa■ca>■caulaL L ■ ■ ^jirrrT.iccT f jv*^* * * f cmrxTzccM L ' , u . ca ttca L>... n, »>,fc oc)u> cccTAcakAcccTC lo:

(SEQ ID NO: 15) " ATrrcTSTGXc r rr!: :c 7i M : : j t j aAAGCccTJUuu^TccxTrcTTTCW ^ * - iu ccatccttccgxc

(SEQ ID NO: 13) 1 KOTCTTNTril-CKItAOMATFVlDrrKCTLATl. J3 

cwwcctaataaaaactaStctt^ 

J4 1.PI IfHLOCVSPtI rCLLAVICHTrPlfACrKGC 41 

201 CTXACC LluiuL CAACOUli uC I O CACTCA i i > ICCU A I 1 iUCJe CTA lU it-^^^^ '"^CCTTCCCATTA. )00 
CATTCCGACACCgrTCGTCAC^eCTaiCTAAAACCCTAAA C C C rraT*^^ 

tt XAVATSACVIFCrAPlfCLTfLAIirrCLSYLCS 100 



TATCATTTCACrCTCTAtrrCTCACAGCATCCATCGCOCCTCTTA > 4 « 
ATACTAAACTCACACATCACACTgr CCTAgCTAC CS C CC XCA AT 

M1SI.SSVTASIAAV 114 
(SEQ ID NO: 17) - rrAAAOrrAAATTCJAT CXAAA CTJkTAJUUfcTTiUU^TSrrCTATrrTACXTCGC^ . iU kATATT M i . . i L^U ITCCTAACTCCJUCCTATC 100

(SEQ ID NO: 18) <^7T7c:^TTTAACTTXCTTT7CXTArrrrjukrrr«aiJUUTACj^ » rrccT AccATTCxrcrrccATxc

(SEQ ID NO: 16) ^ "RsiKLMALSyMCinvLMiirriuTCTrv }> 

: 0 i TrSCe CifU.LI iUG ACCCJUCTCXCrA lU^.I XCrTCIUCTCAgrCCXaiCTX . . . .^. CA I . . . iULLLWlu amCTTXTCCTCTCTATXACTA 300 
ACCCCCCACACAA C C r CCC « * CACTCATACCAATCAACTTttACTCACCTCTCATAAAACACT A h Kh A G )t H rray; i>,AA CC TTCAATXCCACACATXTTCAT 

50 *«VLOIlTDrCTfHSVDT:i.Srri,PrATtCVTHY C3 

jo; CSgrrrAACCCCTATCACTAATCTCAACCATAACAAAXAACATCTTAACACAA LL. « , ILi A U. LU . . . lAl 1 lUlU CAT OH.! lUlA CCA l U iU lCC }0 0 
CCCAAATTCCCCATACTCATTACACTTCCTA 1 1 1 CTASAA I i U « U i * U IAAAACATCATUUUULIUITAAACA CCT A CO CAACATCCTAAAACTCC 

CLRAISWVKDHKItDLIIRTrSSLfYLCXACTILT >$ 

131 A CTCCTCTCTATATCCTACCCTAT : C T ^ . ^ . . ^ . . . ACTOATAATCCAATCCTCAAAAAC C TCTA LL i i O i I ATCCCGATTCAACTCA 1 1 CCCO USATTT 400 
TCACGACAGATATACCATCCGATAiyiACACAAaAAATCACTATTACCITACCACTrrrrCCAaATM 

»*T*VyiLATfPLrrTDMriVICItVTl.VHCI01-lA0ir 139 

43: rrrCAATCCAATOCCTCAATCAAgrrrrCGAAXATTACA Ul i i ACAAAACTCC 460 

AAA5TTACCTTA CCCACTTACTTCCAC A CCTTTT AATCrCAAACACAAAATCTTTTCAOC 

•JO S 1 t w V N E A t H Y S f S r T A L 141 
(SEQ ID NO: 20) * CCTCCCXrrTXCCCTCXTCC\TTTCfcCrr)irCTX*TC» . l . . l fcTCCXCfcXLt* I a JkCXCCACCXCCACCAATCTA . U . . . 1 01LA CCXST7SCTXTACA 100
JCy n*-** ^v/ CCACCgTAAATCCC»C7XCCT*AACTCCATACA— ACTAAAAATA L^ 1 * iU CACCTCT LU « L^. w C7CC7TACATACAmAC*CTCC7CAJUX*T»TCT 

(SEQ ID NO: 21)

1 0 1 OCCACTACCCATCCW»TTCAAXAAACTTTT AACCOCCACTCTrCCTXTt»aUCCT^ 200 
CCCTCATCCCTA CV* I L I AA ^* I CAAAATTCC J C C 1 LA GACCCATA ULU > i CCACATASATCA LU* 1 1 U a^CCTAA LLU f 1 ACATCTACTATACTCA 

(SEQ ID NO: 19) 1 HOioitsritcospYCKi.yLVATPiOMLDonT 30

2 0 1 1 . : lU CTATCCACA H. 1 1 U AAAGAACTCCACTCCA l I C CT CC T G ACCATAOCCCCAATACA U^UL 1 1 11 ULiU mCCA l 1 1 IL ACA I . 1 UJ I C CAACC 3 00 

AAACeACCATA CCr C i CGAA ^> . I CACri^CCTAACQkCCACTCCrATCeCCCTTAT C T CCCa AAA^ 

Jl rBAlOTLKEVDwiAA. tOTRMTCLLLKHrOISTItO 

3 C : ACATCA C 1 1 i CATGACCACAATCCAAACCAAAAAATTCC7CATTTCA 1 1 . 1 1 L 1 ILJ UUkCCACCCCAAACTATTCCTCA CL I L 1 C 1 U A l U.LU > 1 1 I 400 

TCTACTCAAAACTACTCCTgTTACCrrrr AACGACTAAACT AX C CAAACAA H ULUi CC LU: 1 1 m TAACCACTCCACASACTACOCCCAAA 

fi& :SFMCHNAKCK:PDLICrLKAC0S2A0VSDACL 9f 

4 C : C CCTACCATTT CfcCACCCTCCTCATCATTTASTTAACCCACCTATTCAOCAACAAATTCC^ ; rr C TCACT C TT CCACCTA CCTCTCCASGAATT T CT C CC 5O0 

CCGATCCTAAAgTCTCCCACCACTACTAAATCAA7TC:CTCQATAACTC Ul 1 Ui 1 i AACgrCAACACTCACAACCTTCCATCGACACgTCCTTAAAaACCC 

98 PSIS0PCHCLVK*A1E£EIAVVTVPCTSACISA XJO 

so: TTCATTCCCA GiGOl i : ACCCCCACACCCACATATCTT7TACC Ui 1 1 11 1 ACCCAaAAAATCAGCTCAACACAACCAA 1 j 1 1 i iliOLi CT H HUfcAA O UTT COO 
AACTAACCCTCACCAAATCCCCCTCTCCCTCTATACAAAATCCCAAAAAATQCCTrrTTTACTCC ^ 1 lU 1 LI ILUl i AAAAAACOSACA l 1 i 1 ULI AA 

X31 t lASCLAPCPHIPYCrLPXXSGOOXOrFCSKXDY 1«4 

401 ATCCrCAAACACAOArrTTrrATCAATCACCTCXTCCTCTACCAGACACCT^^ _ _ _ 

TAOGACTTTCTCTCTAAAAAATA STTACTCCAC7ACCACAT CL 1 L 1 U 1 U CAA LL 1 i 1 i XTACAATCTTCACATCCCAC 1 GCC UACCCM CAAAACCACTC 



i:: CGAATTCAC CAAAA TCTATCUACAATACCAAACAeCTACAATTTCTCAATTCCTCGAAACCATCirr^ 1 C 1 C 1 CAACCCTCAA I L 1 L M L lUATT 100 

CCrTAACTCCTTrTACATACTTCTTA I U, . . . ^ . ^m TCTTAAACACTTAAOOA LL 1 i 1 LU 1 ACACACTTTCCACACACTTCCeACT T ACfcCAAaACTAA 

I9i CLTXIYCCyOAC7:SELLISZSeTSLKCBCLLI 3)9 

8C: STT GAACC TCCCA CeAAA CCTCTCCAC CAAAA CSATSACOAAOA Ll ILl 1 UU AGAAATCCXACCCCCTATCCACCAACCCATCAACAAJUUTCAWCTA 900 

CAAeTTCCACCCT LU . . 1 CCACACC7C :CTACTCCT7CTC)UCAACAA 1 HI 1 ASOTTCCCCCATA U. I t-U 1 1 LLU i A L lU . 1 1 1 1 A l. IICU AT 

2JlVCCASItCVEEKOEEDLrt.E:OA»100CMICKMOAX a«4 

»0: TTAAC CAAA TACCTAACArrrACCACTOCAATAAOACTCAACTCTACCCTCCCTACCACCACTCOCiU^^^ 1000 
AATTCCrrrATCCATTCTAAATCCTCACCrrATTrrCACTTGACATCCCACWaATCCTCCTCA LLLl ILl 1 1 1 lUl iAl 1 ILLLI LI LILH ACATTATT 
[Sequence figure: nucleic acid sequences (SEQ ID NO: 23 and SEQ ID NO: 24) with the encoded amino acid sequence (SEQ ID NO: 22).]
[Sequence figure: nucleic acid sequence (SEQ ID NO: 27) with the encoded amino acid sequence (SEQ ID NO: 25).]




[Sequence figure: nucleic acid sequences (SEQ ID NO: 29 and SEQ ID NO: 30) with the encoded amino acid sequence (SEQ ID NO: 28).]
[Sequence figure: amino acid sequence (SEQ ID NO: 31) with its corresponding nucleic acid sequences.]
[Sequence figure: nucleic acid sequences (SEQ ID NO: 35 and SEQ ID NO: 36) with the encoded amino acid sequence (SEQ ID NO: 34).]
[Sequence figure: nucleic acid sequences (SEQ ID NO: 38 and SEQ ID NO: 39) with the encoded amino acid sequence.]
[Sequence figure: nucleic acid sequences (one of them SEQ ID NO: 42) with the encoded amino acid sequence (SEQ ID NO: 40).]
[Sequence figure: nucleic acid sequences (SEQ ID NO: 44 and SEQ ID NO: 45) with the encoded amino acid sequence (SEQ ID NO: 43).]
[Sequence figure: amino acid sequence (SEQ ID NO: 46) with its corresponding nucleic acid sequences.]
[Sequence figure: nucleic acid sequences (SEQ ID NO: 50 and SEQ ID NO: 51) with the encoded amino acid sequence.]
[Sequence figure: nucleic acid sequences (one of them SEQ ID NO: 54) with the encoded amino acid sequence (SEQ ID NO: 52).]
[Sequence figure: nucleic acid sequences (SEQ ID NO: 56 and SEQ ID NO: 57) with the encoded amino acid sequence (SEQ ID NO: 55).]
[Sequence figure: nucleic acid sequences (SEQ ID NO: 59 and SEQ ID NO: 60) with the encoded amino acid sequence (SEQ ID NO: 58).]
[Sequence figure: nucleic acid sequences (SEQ ID NO: 62 and SEQ ID NO: 63) with the encoded amino acid sequence (SEQ ID NO: 61).]
[Sequence figure: amino acid sequence (SEQ ID NO: 64) with its corresponding nucleic acid sequences.]
[Sequence figure: nucleic acid sequences (SEQ ID NO: 68 and SEQ ID NO: 69) with the encoded amino acid sequence (SEQ ID NO: 67).]
[Sequence figure: nucleic acid sequence (SEQ ID NO: 71) with the encoded amino acid sequence.]



